A Theoretical Framework for Association Mining based on the Boolean Retrieval Model

Bollmann-Sdorra, Peter; Hafez, Alaaeldin M.; Raghavan, Vijay V.

doi:10.1007/3-540-44801-2_3

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2114)

Included in the following conference series:

International Conference on Data Warehousing and Knowledge Discovery

Abstract

Data mining has been defined as the nontrivial extraction of implicit, previously unknown, and potentially useful information from data. Association mining is one of the important subfields of data mining, in which rules that imply certain association relationships among a set of items in a transaction database are discovered.

The efforts of most researchers focus on discovering rules in the form of implications between itemsets, which are subsets of items that have adequate support. Using itemsets as both the antecedent and the consequent of a rule was motivated by the original application to market baskets, and itemsets represent only the simplest form of predicate. This simplicity is also due in part to the lack of a theoretical framework that includes more expressive predicates.

The framework we develop derives from the observation that information retrieval and association mining are two complementary processes on the same data records or transactions. In information retrieval, given a query, we need to find the subset of records that match the query. In contrast, in data mining, we need to find the queries (rules) that have an adequate number of records supporting them.
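The complementarity described above can be illustrated with a minimal sketch. It is not the formalism developed in the paper: it assumes purely conjunctive queries (itemsets), a toy transaction database invented for illustration, and hypothetical function names `retrieve` and `mine`. Retrieval maps a query to its matching records, while mining enumerates the queries whose match count reaches a support threshold.

```python
from itertools import combinations

# Hypothetical toy transaction database (not taken from the paper).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def retrieve(query, records):
    """Retrieval direction: indices of records containing every item of a conjunctive query."""
    return [i for i, record in enumerate(records) if query <= record]

def mine(records, min_support):
    """Mining direction: all conjunctive queries (itemsets) matched by at least min_support records."""
    items = sorted(set().union(*records))
    frequent = {}
    # Brute-force enumeration of every non-empty itemset; fine for a toy example only.
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            support = len(retrieve(set(candidate), records))
            if support >= min_support:
                frequent[candidate] = support
    return frequent

matches = retrieve({"bread", "milk"}, transactions)
frequent = mine(transactions, min_support=2)
print(matches)   # records 0 and 2 contain both items
print(frequent)
```

The sketch enumerates all candidate itemsets exhaustively, which grows exponentially with the number of items; it is meant only to show the two directions operating on the same records, with `mine` calling `retrieve` for each candidate query.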

In this paper, we introduce a theory of association mining based on a model of retrieval known as the Boolean Retrieval Model, and we present the potential implications of the proposed theory.

Author information

Authors and Affiliations

Fachbereich Informatik, FR 6-9, Technical University of Berlin, Franklin Str. 28/29, 10857, Berlin
Peter Bollmann-Sdorra
The Center for Advanced Computer Studies, University of Louisiana, Lafayette, LA 70504-4330, USA
Alaaeldin M. Hafez & Vijay V. Raghavan

Editor information

Editors and Affiliations

Kyoto University, Kyoto, 606-8501, Japan
Yahiko Kambayashi
EC3, Siebensterngasse 21/3, 1070, Wien
Werner Winiwarter
Center for Spatial Information Science (CSIS), University of Tokyo, 4-6-1, Komaba Meguro-ku, Tokyo, 153-8904, Japan
Masatoshi Arikawa

About this paper

Cite this paper

Bollmann-Sdorra, P., Hafez, A.M., Raghavan, V.V. (2001). A Theoretical Framework for Association Mining based on the Boolean Retrieval Model. In: Kambayashi, Y., Winiwarter, W., Arikawa, M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2001. Lecture Notes in Computer Science, vol 2114. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44801-2_3

DOI: https://doi.org/10.1007/3-540-44801-2_3
Published: 28 August 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42553-3
Online ISBN: 978-3-540-44801-3
eBook Packages: Springer Book Archive
