Simplest Rules Characterizing Classes Generated by δ-Free Sets

  • Conference paper
Research and Development in Intelligent Systems XIX

Abstract

We present a new approach that provides the simplest rules characterizing classes with respect to their left-hand sides. The approach is based on a condensed representation of the data (δ-free sets) that can be computed efficiently. The rules produced have a minimal body, i.e. no proper subset of a rule's left-hand side is sufficient to conclude on the same class value. We give a sensible sufficient condition that avoids important classification conflicts. Experiments show that the number of rules characterizing classes decreases drastically. The technique is operational for large data sets and can be used even in the difficult context of highly correlated data, where other algorithms fail.
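To make the notions of δ-free set and minimal body concrete, the following brute-force sketch may help. It is not the authors' algorithm. It assumes the usual definition from the condensed-representation literature: an itemset is δ-free when no rule between its own items holds with at most δ exceptions. The function names, the `min_support` parameter and the toy data are hypothetical and chosen only for illustration.

```python
from itertools import combinations

def support(itemset, rows):
    """Number of rows (sets of items) containing every item of itemset."""
    return sum(1 for row in rows if itemset <= row)

def is_delta_free(itemset, rows, delta):
    """Assumed definition: no rule (itemset - {a}) => a has at most delta exceptions."""
    return all(
        support(itemset - {a}, rows) - support(itemset, rows) > delta
        for a in itemset
    )

def minimal_class_rules(rows, attributes, classes, delta, min_support=1):
    """Exhaustive sketch: rules body => c with at most delta exceptions,
    where body is a frequent delta-free set and no proper subset of body
    already concludes on the same class c."""
    rules = []
    # Bodies are enumerated by increasing size, so smaller bodies come first.
    for size in range(1, len(attributes) + 1):
        for body in map(frozenset, combinations(sorted(attributes), size)):
            if support(body, rows) < min_support:
                continue
            if not is_delta_free(body, rows, delta):
                continue
            for c in sorted(classes):
                exceptions = support(body, rows) - support(body | {c}, rows)
                covered = any(b < body and k == c for b, k in rules)
                if exceptions <= delta and not covered:
                    rules.append((body, c))
    return rules

# Hypothetical toy data: each row holds attribute items plus one class item.
rows = [
    {"a", "b", "pos"},
    {"a", "b", "pos"},
    {"a", "c", "pos"},
    {"b", "c", "neg"},
    {"c", "neg"},
    {"c", "neg"},
]
attributes = {"a", "b", "c"}
classes = {"pos", "neg"}

for body, cls in minimal_class_rules(rows, attributes, classes, delta=0):
    print(sorted(body), "=>", cls)
```

On this toy data with δ = 0, the sketch keeps `{a} => pos` and `{b, c} => neg`. The rule `{a, b} => pos` also has no exception, but it is discarded because its body is not minimal. The exhaustive enumeration is exponential in the number of attributes and is meant only to illustrate the definitions, not the efficient computation the abstract refers to.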



Copyright information

© 2003 Springer-Verlag London Limited

About this paper

Cite this paper

Crémilleux, B., Boulicaut, JF. (2003). Simplest Rules Characterizing Classes Generated by δ-Free Sets. In: Bramer, M., Preece, A., Coenen, F. (eds) Research and Development in Intelligent Systems XIX. Springer, London. https://doi.org/10.1007/978-1-4471-0651-7_3

  • DOI: https://doi.org/10.1007/978-1-4471-0651-7_3

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-85233-674-5

  • Online ISBN: 978-1-4471-0651-7

  • eBook Packages: Springer Book Archive
