Abstract
Association rules are a key data-mining tool and, as such, have been well researched. To date, this research has focused predominantly on databases containing only categorical data. However, many real-world databases contain quantitative attributes, and current solutions for this case remain inadequate. In this paper we introduce a new definition of quantitative association rules based on statistical inference theory. Our definition reflects the intuition that the goal of association rules is to find extraordinary, and therefore interesting, phenomena in databases. We also introduce the concept of sub-rules, which can be applied to any type of association rule. We present a rigorous experimental evaluation on real-world datasets, demonstrating the usefulness and characteristics of rules mined according to our definition.
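The abstract does not spell out the definition itself, so the following is only a rough illustration of the general idea, not the authors' algorithm. It assumes that a rule is "extraordinary" when the mean of a quantitative attribute inside a subset of the database differs significantly from the mean in the rest of the database, and it checks this with a two-sample Z statistic. The function names, the threshold of 1.96, and the toy records are all hypothetical.

```python
import math
from statistics import mean, variance


def z_statistic(subset, rest):
    """Two-sample Z statistic for the difference between two means."""
    se = math.sqrt(variance(subset) / len(subset) + variance(rest) / len(rest))
    diff = mean(subset) - mean(rest)
    if se == 0:
        # Both groups are constant: any difference in means is unbounded evidence.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / se


def is_interesting(rows, profile, target, z_threshold=1.96):
    """Assumed criterion: the subset selected by `profile` has a mean of
    `target` that differs significantly from the mean of the remaining rows."""
    subset = [r[target] for r in rows if profile(r)]
    rest = [r[target] for r in rows if not profile(r)]
    if len(subset) < 2 or len(rest) < 2:
        return False  # sample variance needs at least two values per group
    return abs(z_statistic(subset, rest)) >= z_threshold


# Toy, made-up records: group "a" has clearly lower values than group "b".
rows = [{"group": "a", "value": v} for v in [10, 11, 9, 10, 12]]
rows += [{"group": "b", "value": v} for v in [20, 21, 19, 22, 20]]

# Second toy set: both halves have exactly the same values.
same = [{"flag": i % 2 == 0, "value": v}
        for i, v in enumerate([10, 10, 11, 11, 9, 9, 12, 12])]

print(is_interesting(rows, lambda r: r["group"] == "a", "value"))
print(is_interesting(same, lambda r: r["flag"], "value"))
```

On samples this small a Z-test is not statistically appropriate; the sizes are kept tiny only so the example is easy to follow.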
Cite this article
Aumann, Y., Lindell, Y. A Statistical Theory for Quantitative Association Rules. Journal of Intelligent Information Systems 20, 255–283 (2003). https://doi.org/10.1023/A:1022812808206
DOI: https://doi.org/10.1023/A:1022812808206