Abstract
When evaluating association rules, rules that differ in both support and confidence have to be compared; a larger support has to be traded against a higher confidence. The solution we propose for this problem is to maximize the expected accuracy that the association rule will have on future data. In a Bayesian framework, we determine the contributions of confidence and support to the expected accuracy on future data. We present a fast algorithm that finds the n best rules that maximize the resulting criterion. The algorithm dynamically prunes redundant rules and parts of the hypothesis space that cannot contain better solutions than the best ones found so far. We evaluate the performance of the algorithm (relative to the Apriori algorithm) on realistic knowledge discovery problems.
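The trade-off described in the abstract can be illustrated with a minimal sketch. It is not the paper's criterion: it assumes a simple Beta(alpha, beta) prior over rule accuracy and uses the posterior mean as the expected accuracy on future data. The names `expected_accuracy`, `top_n_rules`, `alpha`, `beta`, and `support_count` (taken here to be the number of transactions that match the rule's left-hand side) are hypothetical and chosen for illustration only.

```python
def expected_accuracy(confidence, support_count, alpha=1.0, beta=1.0):
    """Posterior mean accuracy of a rule under a Beta(alpha, beta) prior.

    Assumption (not from the paper): accuracy has a Beta prior, and the
    rule was observed to be correct in confidence * support_count of the
    support_count transactions that match its left-hand side.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    if support_count < 0:
        raise ValueError("support_count must be non-negative")
    correct = confidence * support_count
    return (correct + alpha) / (support_count + alpha + beta)


def top_n_rules(rules, n):
    """Return the n rules with the highest expected accuracy.

    Each rule is a (name, confidence, support_count) tuple. This is a
    brute-force ranking; it does no pruning of the hypothesis space.
    """
    ranked = sorted(
        rules,
        key=lambda rule: expected_accuracy(rule[1], rule[2]),
        reverse=True,
    )
    return ranked[:n]


if __name__ == "__main__":
    candidates = [
        ("perfect but rare", 1.0, 2),     # 2 of 2 correct
        ("good and frequent", 0.9, 100),  # 90 of 100 correct
    ]
    for name, conf, supp in top_n_rules(candidates, 2):
        print(name, round(expected_accuracy(conf, supp), 3))
```

Under these assumptions, a rule with confidence 1.0 on only two matching transactions scores lower than a rule with confidence 0.9 on one hundred, because the small sample moves the estimate only a short way from the prior mean. The choice of prior is the main design decision in such a sketch; a uniform prior is used here purely for simplicity.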
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Scheffer, T. (2001). Finding Association Rules That Trade Support Optimally against Confidence. In: De Raedt, L., Siebes, A. (eds) Principles of Data Mining and Knowledge Discovery. PKDD 2001. Lecture Notes in Computer Science, vol 2168. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44794-6_35
DOI: https://doi.org/10.1007/3-540-44794-6_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42534-2
Online ISBN: 978-3-540-44794-8
eBook Packages: Springer Book Archive