Abstract
Association rules describe the degree of dependence between items in transactional datasets through their confidences. In this paper, we introduce the problem of mining top rules, namely association rules with 100% confidence. Traditional approaches to this problem require a minimum support (minsup) threshold and discover only the top rules with support ≥ minsup; they rely on minsup to avoid examining too many candidates, and therefore miss the top rules whose support falls below minsup. Such low-support top rules (e.g. unusual combinations of factors that have always caused a disease) may be very interesting. Fundamentally different from previous work, our proposed method uses a dataset partitioning technique and two border-based algorithms to efficiently discover all top rules with a given consequent, without any support threshold. Importantly, we use borders to represent all top rules concisely instead of enumerating them individually. We also discuss how to discover all zero-confidence rules and some very high confidence (say 90%) rules using approaches similar to those for mining top rules. Experimental results on the Mushroom, Cleveland heart disease, and Boston housing datasets are reported to evaluate the efficiency of the proposed approach.
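To make the notion of a top rule concrete, the following is a minimal brute-force sketch in Python, not the border-based algorithms of the paper. It assumes the dataset is split by whether a transaction contains the consequent, and it reports the minimal antecedents X that occur in at least one transaction with the consequent and in none without it, so that X → consequent has 100% confidence regardless of support. The function name `minimal_top_rule_antecedents` and the `max_len` cap are hypothetical, introduced only for illustration.

```python
from itertools import combinations


def minimal_top_rule_antecedents(transactions, consequent, max_len=3):
    """Brute-force illustration (not the paper's border algorithms).

    Returns the minimal itemsets X, up to max_len items, such that the rule
    X -> consequent has 100% confidence in the given transactions.
    """
    # Partition the dataset by presence of the consequent.
    positives = [frozenset(t) - {consequent} for t in transactions if consequent in t]
    negatives = [frozenset(t) for t in transactions if consequent not in t]

    # Only items seen alongside the consequent can appear in a top rule.
    items = sorted(set().union(*positives)) if positives else []

    minimal = []
    for k in range(1, max_len + 1):
        for combo in combinations(items, k):
            x = frozenset(combo)
            # A superset of a known minimal antecedent is not minimal.
            if any(m <= x for m in minimal):
                continue
            occurs_with = any(x <= p for p in positives)
            occurs_without = any(x <= n for n in negatives)
            # 100% confidence: X is supported, and never appears without the consequent.
            if occurs_with and not occurs_without:
                minimal.append(x)
    return minimal


if __name__ == "__main__":
    data = [{"a", "b", "c"}, {"a", "d", "c"}, {"a", "b"}, {"b", "d"}]
    print(minimal_top_rule_antecedents(data, "c"))
```

In this toy dataset, no single item predicts `c` with certainty, but the pair `{a, d}` does, even though it occurs in only one transaction. Enumerating candidates this way is exponential in the number of items, which is the cost that a concise border representation is meant to avoid.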
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Li, J., Zhang, X., Dong, G., Ramamohanarao, K., Sun, Q. (1999). Efficient Mining of High Confidence Association Rules without Support Thresholds. In: Żytkow, J.M., Rauch, J. (eds) Principles of Data Mining and Knowledge Discovery. PKDD 1999. Lecture Notes in Computer Science, vol 1704. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-48247-5_50
DOI: https://doi.org/10.1007/978-3-540-48247-5_50
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66490-1
Online ISBN: 978-3-540-48247-5
eBook Packages: Springer Book Archive