Abstract
Association mining is the comprehensive identification of frequent patterns in discrete tabular data. Its output can be a listing of hundreds to millions of patterns, few of which are likely to be of interest. In this paper we present a probabilistic metric for filtering association rules that can help highlight the important structure in the data. The proposed filtering technique can be combined with maximal association mining algorithms or heuristic association mining algorithms to search more efficiently for interesting association rules with lower support.
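The abstract does not define the probabilistic metric itself, so the following is only a minimal sketch of what a statistical rule filter of this general kind could look like. It assumes, purely for illustration, that a rule A -> B is scored by the hypergeometric upper-tail probability of seeing at least the observed number of co-occurrences if A and B were independent. The function names, the dictionary keys, the toy counts, and the 0.05 threshold are all hypothetical and are not taken from the paper.

```python
from math import comb


def hypergeom_upper_tail(n_total, n_a, n_b, n_ab):
    """Return P(X >= n_ab) for X ~ Hypergeometric(n_total, n_a, n_b).

    n_total: number of transactions
    n_a:     transactions containing the antecedent A
    n_b:     transactions containing the consequent B
    n_ab:    transactions containing both A and B
    """
    denom = comb(n_total, n_b)
    upper = min(n_a, n_b)
    # Sum the probability mass over all co-occurrence counts >= the observed one.
    tail = sum(
        comb(n_a, k) * comb(n_total - n_a, n_b - k)
        for k in range(n_ab, upper + 1)
    )
    return tail / denom


def filter_rules(rules, n_total, alpha=0.05):
    """Keep rules whose co-occurrence is unlikely under independence.

    alpha is an illustrative threshold (assumption); no multiple-testing
    correction is applied in this sketch.
    """
    kept = []
    for rule in rules:
        p = hypergeom_upper_tail(n_total, rule["n_a"], rule["n_b"], rule["n_ab"])
        if p <= alpha:
            kept.append((rule["name"], p))
    return kept


# Toy counts (hypothetical): 100 transactions, two candidate rules.
rules = [
    # A and B each occur 20 times and co-occur 15 times (expected about 4).
    {"name": "bread -> butter", "n_a": 20, "n_b": 20, "n_ab": 15},
    # A and B each occur 50 times and co-occur 25 times (exactly as expected).
    {"name": "milk -> eggs", "n_a": 50, "n_b": 50, "n_ab": 25},
]
kept = filter_rules(rules, n_total=100)
kept_names = [name for name, _ in kept]
```

In this toy example only the first rule survives the filter, since its co-occurrence count is far above what independence would predict. When many rules are tested at once, a fixed threshold like this will let some spurious rules through, so a multiple-testing correction would normally be layered on top.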
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ableson, A., Glasgow, J. (2003). Efficient Statistical Pruning of Association Rules. In: Lavrač, N., Gamberger, D., Todorovski, L., Blockeel, H. (eds) Knowledge Discovery in Databases: PKDD 2003. Lecture Notes in Computer Science, vol 2838. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39804-2_5
DOI: https://doi.org/10.1007/978-3-540-39804-2_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20085-7
Online ISBN: 978-3-540-39804-2
eBook Packages: Springer Book Archive