Abstract
An indirect association refers to an infrequent itempair, each item of which co-occurs frequently with a frequent itemset called the “mediator”. Although indirect associations have been recognized as powerful patterns for revealing interesting information hidden in many applications, such as recommendation ranking, substitute or competitive items, and common web navigation paths, almost no work, to our knowledge, has investigated how to discover this type of pattern from streaming data. In this paper, we consider the problem of mining indirect associations from data streams. Unlike contemporary research on stream data mining, which investigates the problem separately for each type of streaming model, we treat the problem in a generic way. We propose a generic window model that can represent all classical streaming models and retains user flexibility in defining new models. In this context, we develop a generic algorithm that guarantees no false-positive rules and a bounded support error, as long as the window model is specifiable by the proposed generic model. Comprehensive experiments on both synthetic and real datasets have shown the effectiveness of the proposed approach as a generic way of finding indirect association rules over streaming data.
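To make the definition in the abstract concrete, the following is a minimal sketch of checking for indirect associations over a small, static set of transactions. It is not the paper's streaming algorithm and does not implement the generic window model. The function names, the thresholds `pair_max_sup` and `mediator_min_sup`, and the toy data are all assumptions made for illustration, and the sketch uses support alone as the co-occurrence criterion, whereas definitions in the literature may also require a dependence measure.

```python
from itertools import combinations


def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def indirect_pairs(transactions, mediator, pair_max_sup, mediator_min_sup):
    """Return item pairs that are indirectly associated via `mediator`.

    Assumed criterion (hypothetical, support only): the pair {a, b} is
    infrequent, while {a} | mediator and {b} | mediator are both frequent.
    """
    mediator = frozenset(mediator)
    # Candidate items are all items that are not part of the mediator.
    items = sorted(set().union(*transactions) - mediator)
    result = []
    for a, b in combinations(items, 2):
        # Skip pairs that co-occur often: they are directly associated.
        if support(transactions, {a, b}) >= pair_max_sup:
            continue
        if (support(transactions, mediator | {a}) >= mediator_min_sup
                and support(transactions, mediator | {b}) >= mediator_min_sup):
            result.append((a, b))
    return result


# Toy data (invented): coke and pepsi rarely appear together,
# but each frequently appears with chips.
transactions = [frozenset(t) for t in [
    {"coke", "chips"}, {"coke", "chips"}, {"coke", "chips"},
    {"pepsi", "chips"}, {"pepsi", "chips"}, {"pepsi", "chips"},
    {"coke", "pepsi", "bread"}, {"bread"},
]]

pairs = indirect_pairs(transactions, {"chips"},
                       pair_max_sup=0.2, mediator_min_sup=0.3)
print(pairs)
```

In this toy data, the pair coke and pepsi appears together in only one of eight transactions, yet each item appears with chips in three of eight, so the pair is reported as indirectly associated through the mediator {chips}. This matches the substitute or competitive items application mentioned in the abstract.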
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lin, WY., Wei, YE., Chen, CH. (2011). A Generic Approach for Mining Indirect Association Rules in Data Streams. In: Mehrotra, K.G., Mohan, C.K., Oh, J.C., Varshney, P.K., Ali, M. (eds) Modern Approaches in Applied Intelligence. IEA/AIE 2011. Lecture Notes in Computer Science, vol 6703. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21822-4_11
DOI: https://doi.org/10.1007/978-3-642-21822-4_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21821-7
Online ISBN: 978-3-642-21822-4
eBook Packages: Computer Science, Computer Science (R0)