Encyclopedia of Database Systems

2018 Edition
Editors: Ling Liu, M. Tamer Özsu

Frequent Itemsets and Association Rules

  • Hong Cheng
  • Jiawei Han
Reference work entry
DOI: https://doi.org/10.1007/978-1-4614-8265-9_171

Synonyms

Frequent patterns; Large itemsets

Definition

Let I = {i1, i2, …, in} be a set of items, and DB = {T1, T2, …, Tm} be a transaction database, where Ti (i ∈ [1 … m]) is a transaction containing a set of items in I. The support (or occurrence frequency) of an itemset A, where A is a set of items from I, is the number of transactions in DB that contain A. An itemset A is frequent if A’s support is no less than a user-specified minimum support threshold θ. An itemset A which contains k items is called a k-itemset.
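The definition above can be illustrated with a minimal Python sketch. It is a brute-force enumeration over all subsets of I, meant only to make the notions of support and frequent itemset concrete; it is not one of the mining algorithms from the literature. The function names `support` and `frequent_itemsets`, and the toy database `DB`, are assumptions introduced for illustration.

```python
from itertools import combinations


def support(itemset, db):
    """Count the transactions in db that contain every item of itemset."""
    items = frozenset(itemset)
    return sum(1 for t in db if items <= t)


def frequent_itemsets(db, theta):
    """Return {itemset: support} for every itemset with support >= theta.

    Brute force: tries every non-empty subset of the item universe, so it
    is exponential in the number of items and only suitable for toy data.
    """
    universe = sorted(set().union(*db))
    result = {}
    for k in range(1, len(universe) + 1):  # k-itemsets for k = 1, 2, ...
        for candidate in combinations(universe, k):
            s = support(candidate, db)
            if s >= theta:
                result[frozenset(candidate)] = s
    return result


# Hypothetical toy transaction database (not from the source).
DB = [
    frozenset({"milk", "cereal", "bread"}),
    frozenset({"milk", "cereal"}),
    frozenset({"milk", "bread"}),
    frozenset({"eggs"}),
]

if __name__ == "__main__":
    for itemset, s in frequent_itemsets(DB, theta=2).items():
        print(sorted(itemset), s)
```

With θ = 2 on this toy database, the 2-itemset {milk, cereal} has support 2 and is therefore frequent, while {eggs} has support 1 and is not.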

Historical Background

Frequent itemset mining was first proposed by Agrawal et al. [2] for market basket analysis in the context of association rule mining. It analyzes customer buying habits by finding associations between the different items that customers place in their “shopping baskets.” For instance, if customers are buying milk, how likely are they to also buy cereal (and what kind of cereal) on the same trip to the supermarket? Such information can lead to...


Recommended Reading

  1. Agrawal R, Gehrke J, Gunopulos D, Raghavan P. Automatic subspace clustering of high dimensional data for data mining applications. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 1998. p. 94–105.
  2. Agrawal R, Imielinski T, Swami A. Mining association rules between sets of items in large databases. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 1993. p. 207–16.
  3. Agrawal R, Srikant R. Fast algorithms for mining association rules. In: Proceedings of the 20th International Conference on Very Large Data Bases; 1994. p. 487–99.
  4. Brin S, Motwani R, Ullman JD, Tsur S. Dynamic itemset counting and implication rules for market basket analysis. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 1997. p. 255–64.
  5. Cheng C-H, Fu AW, Zhang Y. Entropy-based subspace clustering for mining numerical data. In: Proceedings of the 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; 1999. p. 84–93.
  6. Cheng H, Yan X, Han J, Hsu C. Discriminative frequent pattern analysis for effective classification. In: Proceedings of the 23rd International Conference on Data Engineering; 2007. p. 716–25.
  7. Cong G, Tan K-L, Tung AKH, Xu X. Mining top-k covering rule groups for gene expression data. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2005. p. 670–81.
  8. Eirinaki M, Vazirgiannis M. Web mining for web personalization. ACM Trans Internet Technol. 2003;3(1):1–27.
  9. Goethals B, Zaki M. An introduction to workshop on frequent itemset mining implementations. In: Proceedings of the ICDM International Workshop on Frequent Itemset Mining Implementations; 2003. p. 1–13.
  10. Han J, Pei J, Yin Y. Mining frequent patterns without candidate generation. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2000. p. 1–12.
  11. Li W, Han J, Pei J. CMAR: Accurate and efficient classification based on multiple class-association rules. In: Proceedings of the 1st IEEE International Conference on Data Mining; 2001. p. 369–76.
  12. Li Z, Zhou Y. PR-Miner: automatically extracting implicit programming rules and detecting violations in large software code. In: Proceedings of the ACM SIGSOFT Symposium on Foundations of Software Engineering; 2005. p. 306–15.
  13. Liu B, Hsu W, Ma Y. Integrating classification and association rule mining. In: Proceedings of the 4th International Conference on Knowledge Discovery and Data Mining; 1998. p. 80–6.
  14. Park JS, Chen MS, Yu PS. An effective hash-based algorithm for mining association rules. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 1995. p. 175–86.
  15. Pei J, Han J, Mortazavi-Asl B, Zhu H. Mining access patterns efficiently from web logs. In: Advances in Knowledge Discovery and Data Mining, 4th Pacific-Asia Conference; 2000. p. 396–407.
  16. Savasere A, Omiecinski E, Navathe S. An efficient algorithm for mining association rules in large databases. In: Proceedings of the 21st International Conference on Very Large Data Bases; 1995. p. 432–43.
  17. Srivastava J, Cooley R, Deshpande M, Tan P. Web usage mining: discovery and applications of usage patterns from web data. SIGKDD Explor. 2000;2(1):12–23.
  18. Toivonen H. Sampling large databases for association rules. In: Proceedings of the 22nd International Conference on Very Large Data Bases; 1996. p. 134–45.
  19. Wang H, Wang W, Yang J, Yu PS. Clustering by pattern similarity in large data sets. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2002. p. 418–27.
  20. Zaki MJ. Scalable algorithms for association mining. IEEE Trans Knowl Data Eng. 2000;12(3):372–90.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Hong Kong, China
  2. University of Illinois at Urbana-Champaign, Urbana, USA

Section editors and affiliations

  • Jian Pei (1)

  1. School of Computing Science, Simon Fraser Univ., Burnaby, Canada