Discovering Association Patterns Based on Mutual Information

  • Conference paper
Machine Learning and Data Mining in Pattern Recognition (MLDM 2003)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2734)

Abstract

Identifying and expressing data patterns in the form of association rules is a commonly used technique in data mining. Typically, association rule discovery is based on two criteria: support and confidence. In this paper we briefly discuss the insufficiency of these two criteria and argue for including interestingness/dependency as a criterion for (association) pattern discovery. From a practical computational perspective, we show how the proposed criterion, grounded in interestingness, can be used to improve the efficiency of the pattern discovery mechanism. Furthermore, we show a probabilistic inference mechanism that provides an alternative approach to pattern discovery. An illustrative example and a preliminary study evaluating the proposed approach are presented.
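
The gap between support/confidence and a dependency-based criterion can be illustrated with a small sketch. The code below is not taken from the paper, whose full text is not shown here. It assumes a toy transaction list and uses pointwise mutual information, log2(P(a, b) / (P(a) P(b))), as one plausible dependency measure; the function names and the data are hypothetical.

```python
import math

def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent); assumes the antecedent occurs."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

def pointwise_mi(transactions, a, b):
    """log2(P(a, b) / (P(a) * P(b))); zero when a and b are independent.

    Assumes a and b co-occur at least once, so the logarithm is defined.
    """
    joint = support(transactions, set(a) | set(b))
    return math.log2(joint / (support(transactions, a) * support(transactions, b)))

# Hypothetical toy data: "bread" is in every basket, "milk" in three of four.
transactions = [
    {"bread", "milk"},
    {"bread", "milk"},
    {"bread", "milk"},
    {"bread"},
]

s = support(transactions, {"milk", "bread"})
c = confidence(transactions, {"milk"}, {"bread"})
pmi = pointwise_mi(transactions, {"milk"}, {"bread"})
print(s, c, pmi)
```

In this toy data the rule milk => bread has high support and confidence, yet its pointwise mutual information is zero: bread appears in every transaction, so observing milk says nothing new about bread. This is the kind of case in which support and confidence alone can flag a pattern that carries no dependency.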

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sy, B.K. (2003). Discovering Association Patterns Based on Mutual Information. In: Perner, P., Rosenfeld, A. (eds) Machine Learning and Data Mining in Pattern Recognition. MLDM 2003. Lecture Notes in Computer Science, vol 2734. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45065-3_32

  • DOI: https://doi.org/10.1007/3-540-45065-3_32

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40504-7

  • Online ISBN: 978-3-540-45065-8

  • eBook Packages: Springer Book Archive
