Abstract
Many algorithms have been proposed for computing association rules using the support-confidence framework. One drawback of this framework is its weakness in expressing the notion of correlation. We propose an efficient algorithm for mining association rules that uses statistical metrics to determine correlation. Conventional techniques developed for the support-confidence framework cannot be applied directly, since correlation functions do not satisfy the antimonotonicity property that is crucial to traditional methods. In this paper, we propose heuristics for the vertical decomposition of a database, for pruning unproductive itemsets, and for traversing a set-enumeration tree of itemsets, all tailored to the calculation of the N most significant association rules, where N can be specified by the user. We experimentally compared the combination of these three techniques with the previous statistical approach. Our tests confirmed that the computational performance improves by several orders of magnitude.
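The ideas named in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: it assumes the chi-squared statistic as the correlation metric (the abstract does not name one), it enumerates itemsets by brute force instead of using the paper's pruning and tree-traversal heuristics, and the names `vertical_layout`, `chi_squared`, and `top_n_rules` are hypothetical. It shows only a vertical (item to transaction-id set) layout, a correlation score for rules of the form itemset => target, and selection of the N best rules.

```python
import heapq
from itertools import combinations


def vertical_layout(transactions):
    """Map each item to the set of transaction ids (tids) that contain it."""
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)
    return tidsets


def chi_squared(n, n_x, n_y, n_xy):
    """Chi-squared statistic of the 2x2 contingency table for X and Y.

    n is the number of transactions, n_x and n_y the counts of X and Y,
    and n_xy the count of transactions containing both.
    """
    cells = [
        (n_xy, n_x, n_y),                      # X and Y
        (n_x - n_xy, n_x, n - n_y),            # X and not Y
        (n_y - n_xy, n - n_x, n_y),            # not X and Y
        (n - n_x - n_y + n_xy, n - n_x, n - n_y),  # neither
    ]
    total = 0.0
    for observed, row_sum, col_sum in cells:
        expected = row_sum * col_sum / n
        if expected == 0:
            return 0.0  # degenerate table: treat as uncorrelated
        total += (observed - expected) ** 2 / expected
    return total


def top_n_rules(transactions, target, n_best, max_len=2):
    """Return the n_best (score, itemset) pairs for rules itemset => target.

    Brute-force enumeration up to max_len items; an assumption made for
    brevity, not the pruning strategy described in the paper.
    """
    n = len(transactions)
    tidsets = vertical_layout(transactions)
    target_tids = tidsets[target]
    items = sorted(i for i in tidsets if i != target)
    scored = []
    for size in range(1, max_len + 1):
        for itemset in combinations(items, size):
            # Support of an itemset is the intersection of its tid sets.
            tids = set.intersection(*(tidsets[i] for i in itemset))
            score = chi_squared(n, len(tids), len(target_tids),
                                len(tids & target_tids))
            scored.append((score, itemset))
    return heapq.nlargest(n_best, scored)


# Toy data: "a" and "b" together imply "t", but neither does alone.
transactions = [
    {"a", "b", "t"}, {"a", "b", "t"},
    {"a"}, {"a"}, {"b"}, {"b"}, {"c"}, {"c"},
]
best = top_n_rules(transactions, "t", n_best=1)
```

On this toy data the itemset {a, b} scores 8.0 while {a} alone scores about 2.67, so a superset can be more significant than its subset. Support only shrinks as items are added (4 transactions for {a}, 2 for {a, b}), which is why support-based pruning is safe and the same pruning applied to a correlation score is not.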
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sese, J., Morishita, S. (2002). Answering the Most Correlated N Association Rules Efficiently. In: Elomaa, T., Mannila, H., Toivonen, H. (eds) Principles of Data Mining and Knowledge Discovery. PKDD 2002. Lecture Notes in Computer Science, vol 2431. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45681-3_34
DOI: https://doi.org/10.1007/3-540-45681-3_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44037-6
Online ISBN: 978-3-540-45681-0
eBook Packages: Springer Book Archive