Simplest Rules Characterizing Classes Generated by δ-Free Sets

Crémilleux, Bruno; Boulicaut, Jean-François

doi:10.1007/978-1-4471-0651-7_3

Abstract

We present a new approach that provides the simplest rules characterizing classes with respect to their left-hand sides. The approach is based on a condensed representation of the data (δ-free sets) that can be computed efficiently. The rules produced have a minimal body, i.e., no proper subset of a rule's left-hand side is enough to conclude on the same class value. We give a sensible sufficient condition that avoids important classification conflicts. Experiments show that the number of rules characterizing classes decreases drastically. The technique is operational for large data sets and can be used even in the difficult context of highly correlated data, where other algorithms fail.
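
As a rough illustration of the minimal-body property stated in the abstract, the Python sketch below checks whether a rule's left-hand side can be shortened without losing its conclusion. It rests on assumptions that go beyond the abstract: a rule is taken to "conclude on a class" when its body occurs in the data and at most δ rows matching the body carry a different class. The function names, the `delta` parameter, and the toy data are hypothetical, and this brute-force subset check is not the authors' δ-free-set algorithm.

```python
from itertools import combinations

def support(rows, body):
    """Number of rows whose item set contains every item of `body`."""
    return sum(1 for items, _ in rows if body <= items)

def exceptions(rows, body, cls):
    """Number of rows that match `body` but carry a class other than `cls`."""
    return sum(1 for items, label in rows if body <= items and label != cls)

def concludes(rows, body, cls, delta):
    """Assumed criterion: body occurs at least once and has at most `delta` exceptions."""
    return support(rows, body) > 0 and exceptions(rows, body, cls) <= delta

def has_minimal_body(rows, body, cls, delta):
    """True if body -> cls holds and no proper subset of body also concludes on cls."""
    if not concludes(rows, body, cls, delta):
        return False
    # Enumerate proper subsets only: sizes 0 .. len(body) - 1.
    for size in range(len(body)):
        for subset in combinations(sorted(body), size):
            if concludes(rows, frozenset(subset), cls, delta):
                return False
    return True

# Hypothetical toy data: (set of items, class label) per row.
ROWS = [
    (frozenset({"a", "b"}), "pos"),
    (frozenset({"a", "b", "c"}), "pos"),
    (frozenset({"a"}), "neg"),
    (frozenset({"b"}), "neg"),
    (frozenset({"c"}), "neg"),
]

if __name__ == "__main__":
    print(has_minimal_body(ROWS, frozenset({"a", "b"}), "pos", 0))       # True
    print(has_minimal_body(ROWS, frozenset({"a", "b", "c"}), "pos", 0))  # False
```

On this toy data with δ = 0, the body {a, b} is minimal for class "pos", whereas {a, b, c} is not, because its subset {a, b} already concludes on the same class. The exhaustive enumeration is exponential in the body size and is meant only to make the definition concrete.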

Author information

Authors and Affiliations

GREYC CNRS-UMR 6072, University of Caen, F-14032, Caen Cedex, France
Bruno Crémilleux
Laboratoire d’Ingénierie des Systèmes d’Information, INSA Lyon, F-69621, Villeurbanne Cedex, France
Jean-François Boulicaut

Editor information

Editors and Affiliations

Faculty of Technology, University of Portsmouth, Portsmouth, UK
Max Bramer BSc, PhD, CEng, FBCS, FIEE, FRSA (Technical Programme Chair)
Dept of Computer Science, University of Aberdeen, Aberdeen, UK
Alun Preece (Deputy Technical Programme Chair)
Department of Computer Science, University of Liverpool, Liverpool, UK
Frans Coenen (Conference Chairman)

About this paper

Cite this paper

Crémilleux, B., Boulicaut, JF. (2003). Simplest Rules Characterizing Classes Generated by δ-Free Sets. In: Bramer, M., Preece, A., Coenen, F. (eds) Research and Development in Intelligent Systems XIX. Springer, London. https://doi.org/10.1007/978-1-4471-0651-7_3

DOI: https://doi.org/10.1007/978-1-4471-0651-7_3
Publisher Name: Springer, London
Print ISBN: 978-1-85233-674-5
Online ISBN: 978-1-4471-0651-7
eBook Packages: Springer Book Archive