Shorter Rules Are Better, Aren’t They?

Stecher, Julius; Janssen, Frederik; Fürnkranz, Johannes

doi:10.1007/978-3-319-46307-0_18

Julius Stecher¹⁶,
Frederik Janssen¹⁶ &
Johannes Fürnkranz¹⁶

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 9956))

Included in the following conference series:

International Conference on Discovery Science

1651 Accesses
4 Citations

Abstract

It is conventional wisdom in inductive rule learning that shorter rules should be preferred over longer rules, a principle also known as Occam’s Razor. This is typically justified with the fact that longer rules tend to be more specific and are therefore also more likely to overfit the data. In this position paper, we would like to challenge this assumption by demonstrating that variants of conventional rule learning heuristics, so-called inverted heuristics, learn longer rules that are not more specific than the shorter rules learned by conventional heuristics. Moreover, we will argue with some examples that such longer rules may in many cases be more understandable than shorter rules, again contradicting a widely held view. This is not only relevant for subgroup discovery but also for related concepts like characteristic rules, formal concept analysis, or closed itemsets.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

1.
Entities should not be multiplied beyond necessity.

References

Bensusan, H.: God doesn’t always shave with Occam’s Razor - learning when and how to prune. In: Nédellec, C., Rouveirol, C. (eds.) Proceedings of the 10th European Conference on Machine Learning (ECML 1998), pp. 119–124 (1998)
Google Scholar
Blumer, A., Ehrenfeucht, A., Haussler, D., Warmuth, M.K.: Occam’s Razor. Inf. Process. Lett. 24, 377–380 (1987)
Article MathSciNet MATH Google Scholar
Domingos, P.: The role of Occam’s Razor in knowledge discovery. Data Min. Knowl. Discovery 3(4), 409–425 (1999)
Article Google Scholar
Fürnkranz, J.: Separate-and-conquer rule learning. Artif. Intell. Rev. 13(1), 3–54 (1999)
Article MATH Google Scholar
Fürnkranz, J., Flach, P.A.: ROC ’n’ rule learning - towards a better understanding of covering algorithms. Mach. Learn. 58(1), 39–77 (2005)
Article MATH Google Scholar
Fürnkranz, J., Gamberger, D., Lavrač, N.: Foundations of Rule Learning. Springer, Heidelberg (2012)
Book MATH Google Scholar
Gamberger, D., Lavrač, N.: Active subgroup mining: a case study in coronary heart disease risk group detection. Artif. Intell. Med. 28(1), 27–57 (2003)
Article Google Scholar
Ganter, B., Wille, R.: Formal Concept Analysis - Mathematical Foundations. Springer, Heidelberg (1999)
Book MATH Google Scholar
Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.H.: The weka data mining software: an update. SIGKDD Explor. 11(1), 10–18 (2009)
Article Google Scholar
Janssen, F., Fürnkranz, J.: On the quest for optimal rule learning heuristics. Mach. Learn. 78(3), 343–379 (2010)
Article MathSciNet Google Scholar
Kralj, P., Lavrač, N., Gamberger, D., Krstačić, A.: Contrast set mining through subgroup discovery applied to brain ischaemina data. In: Zhou, Z.-H., Li, H., Yang, Q. (eds.) PAKDD 2016. LNCS (LNAI), vol. 4426, pp. 579–586. Springer, Heidelberg (2007). doi:10.1007/978-3-540-71701-0_61
Chapter Google Scholar
Kralj Novak, P., Lavrač, N., Webb, G.I.: Supervised descriptive rule discovery: a unifying survey of contrast set, emerging pattern and subgroup mining. J. Mach. Learn. Res. 10, 377–403 (2009)
MATH Google Scholar
Lavrač, N., Kavšek, B., Flach, P., Todorovski, L.: Subgroup discovery with CN2-SD. J. Mach. Learn. Res. 5, 153–188 (2004)
MathSciNet Google Scholar
Michalski, R.S.: On the quasi-minimal solution of the general covering problem. In: Proceedings of the 5th International Symposium on Information Processing (FCIP 1969), pp. 125–128, Bled, Yugoslavia (1969)
Google Scholar
Michalski, R.S.: A theory and methodology of inductive learning. Artif. Intell. 20(2), 111–162 (1983)
Article MathSciNet Google Scholar
Mitchell, T.M.: The Need for Biases in Learning Generalizations. Technical report, Computer Science Department, Rutgers University, New Brunswick, MA (1980)
Google Scholar
Murphy, P.M., Pazzani, M.J.: Exploring the decision forest: an empirical investigation of Occam’s Razor in decision tree induction. J. Artif. Intell. Res. 1, 257–275 (1994)
MATH Google Scholar
Paulheim, H., Fürnkranz, J.: Unsupervised generation of data mining features from linked open data. In: Proceedings of the International Conference on Web Intelligence and Semantics (WIMS 2012) (2012)
Google Scholar
Ristoski, P., Paulheim, H.: Analyzing statistics with background knowledge from linked open data. In: Proceedings of the 1st International Workshop on Semantic Statistics (SemStats-2013). CEUR workshop proceedings, Sydney, Australia (2013)
Google Scholar
Stecher, J., Janssen, F., Fürnkranz, J.: Separating rule refinement and rule selection heuristics in inductive rule learning. In: Calders, T., Esposito, F., Hüllermeier, E., Meo, R. (eds.) ECML PKDD 2014. LNCS (LNAI), vol. 8726, pp. 114–129. Springer, Heidelberg (2014). doi:10.1007/978-3-662-44845-8_8
Google Scholar
Stumme, G., Taouil, R., Bastide, Y., Pasquier, N., Lakhal, L.: Computing iceberg concept lattices with Titanic. Data Knowl. Eng. 42(2), 189–222 (2002)
Article MATH Google Scholar
Webb, G.I.: Further experimental evidence against the utility of Occam’s Razor. J. Artif. Intell. Res. 4, 397–417 (1996)
MATH Google Scholar
Wille, R.: Restructuring lattice theory: an approach based on hierarchies of concepts. In: Rival, I. (ed.) Ordered Sets, pp. 445–470. Reidel, Dordrecht-Boston (1982)
Chapter Google Scholar
Zaki, M.J., Hsiao, C.J.: CHARM: an efficient algorithm for closed itemset mining. In: Grossman, R.L., Han, J., Kumar, V., Mannila, H., Motwani, R. (eds.) Proceedings of the 2nd SIAM International Conference on Data Mining (SDM-02), pp. 457–473. Arlington, VA (2002)
Google Scholar

Download references

Acknowledgements

We would like to thank Dragan Gamberger, Nada Lavrač, and Heiko Paulheim for letting us play with their data.

Author information

Authors and Affiliations

Technische Universität Darmstadt, Knowledge Engineering, Darmstadt, Germany
Julius Stecher, Frederik Janssen & Johannes Fürnkranz

Authors

Julius Stecher
View author publications
You can also search for this author in PubMed Google Scholar
Frederik Janssen
View author publications
You can also search for this author in PubMed Google Scholar
Johannes Fürnkranz
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Johannes Fürnkranz .

Editor information

Editors and Affiliations

Campus Middelhe, M.G.103a, Universiteit Antwerpen Campus Middelhe, M.G.103a, Antwerp, Belgium
Toon Calders
Università degli Studi di Bari Aldo Moro, Bari, Italy
Michelangelo Ceci
Bari, Italy
Donato Malerba

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Stecher, J., Janssen, F., Fürnkranz, J. (2016). Shorter Rules Are Better, Aren’t They?. In: Calders, T., Ceci, M., Malerba, D. (eds) Discovery Science. DS 2016. Lecture Notes in Computer Science(), vol 9956. Springer, Cham. https://doi.org/10.1007/978-3-319-46307-0_18

Download citation

DOI: https://doi.org/10.1007/978-3-319-46307-0_18
Published: 21 September 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46306-3
Online ISBN: 978-3-319-46307-0
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics