Shorter Rules Are Better, Aren’t They?

  • Conference paper
Discovery Science (DS 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9956)

Abstract

It is conventional wisdom in inductive rule learning that shorter rules should be preferred over longer rules, a principle also known as Occam’s Razor. This is typically justified with the fact that longer rules tend to be more specific and are therefore also more likely to overfit the data. In this position paper, we would like to challenge this assumption by demonstrating that variants of conventional rule learning heuristics, so-called inverted heuristics, learn longer rules that are not more specific than the shorter rules learned by conventional heuristics. Moreover, we will argue with some examples that such longer rules may in many cases be more understandable than shorter rules, again contradicting a widely held view. This is not only relevant for subgroup discovery but also for related concepts like characteristic rules, formal concept analysis, or closed itemsets.

Notes

  1. Entities should not be multiplied beyond necessity.

Acknowledgements

We would like to thank Dragan Gamberger, Nada Lavrač, and Heiko Paulheim for letting us play with their data.

Author information

Corresponding author

Correspondence to Johannes Fürnkranz.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Stecher, J., Janssen, F., Fürnkranz, J. (2016). Shorter Rules Are Better, Aren’t They? In: Calders, T., Ceci, M., Malerba, D. (eds) Discovery Science. DS 2016. Lecture Notes in Computer Science, vol 9956. Springer, Cham. https://doi.org/10.1007/978-3-319-46307-0_18

  • DOI: https://doi.org/10.1007/978-3-319-46307-0_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-46306-3

  • Online ISBN: 978-3-319-46307-0

  • eBook Packages: Computer Science, Computer Science (R0)
