Real World Association Rule Mining
Across a wide variety of fields, data are being collected and accumulated at a dramatic pace, and a new generation of techniques has therefore been proposed to assist humans in extracting useful information from these rapidly growing volumes of data. One of these techniques is association rule discovery, a key data mining task that has attracted tremendous interest among data mining researchers. Because of its wide applicability, many algorithms have been developed to perform the association rule mining task. However, an immediate problem facing researchers is deciding which of these algorithms is likely to be a good match for the database to be mined. In this paper we consider this problem, addressing both the algorithmic and the data aspects of association rule mining through a systematic experimental evaluation of different algorithms on different databases. We observe that each algorithm has different strengths and weaknesses depending on the characteristics of the data. This analysis enables us to develop an algorithm that achieves better performance than previously proposed algorithms, especially on databases obtained from actual applications.