From Local to Global Patterns: Evaluation Issues in Rule Learning Algorithms

Fürnkranz, Johannes

doi:10.1007/11504245_2

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3539)

Abstract

Separate-and-conquer or covering rule learning algorithms may be viewed as a technique for using local pattern discovery for generating a global theory. Local patterns are learned one at a time, and each pattern is evaluated in a local context, with respect to the number of positive and negative examples that it covers. Global context is provided by removing the examples that are covered by previous patterns before learning a new rule. In this paper, we discuss several research issues that arise in this context. We start with a brief discussion of covering algorithms, their problems, and review a few suggestions for resolving them. We then discuss the suitability of a well-known family of evaluation metrics, and analyze how they trade off coverage and precision of a rule. Our conclusion is that in many applications, coverage is only needed for establishing statistical significance, and that the rule discovery process should focus on optimizing precision. As an alternative to coverage-based overfitting avoidance, we then investigate the feasibility of meta-learning a predictor for the true precision of a rule, based on its coverage on the training set. The results confirm that this is a valid approach, but also point at some shortcomings that need to be addressed in future work.
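The covering loop described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation. The rule representation (a conjunction of attribute-value tests), the greedy refinement in `learn_one_rule`, and all function names are assumptions made for this example. Plain precision stands in for the evaluation metric, and any other function of the covered positive and negative counts could be passed as `heuristic`.

```python
from typing import Callable, Dict, List, Tuple

Example = Tuple[Dict[str, str], bool]   # (attribute values, is_positive)
Rule = Dict[str, str]                   # conjunction of attribute == value tests


def covers(rule: Rule, x: Dict[str, str]) -> bool:
    """A rule covers an example if every test in the rule holds for it."""
    return all(x.get(a) == v for a, v in rule.items())


def counts(rule: Rule, examples: List[Example]) -> Tuple[int, int]:
    """Local evaluation context: covered positives p and covered negatives n."""
    p = sum(1 for x, y in examples if y and covers(rule, x))
    n = sum(1 for x, y in examples if not y and covers(rule, x))
    return p, n


def precision(p: int, n: int) -> float:
    return p / (p + n) if p + n else 0.0


def learn_one_rule(examples: List[Example],
                   heuristic: Callable[[int, int], float] = precision) -> Rule:
    """Greedy refinement (hypothetical): add the test that most improves the heuristic."""
    rule: Rule = {}
    while True:
        p, n = counts(rule, examples)
        if n == 0:                       # rule is consistent on the current examples
            return rule
        best, best_score = None, heuristic(p, n)
        candidates = {(a, v) for x, _ in examples if covers(rule, x)
                      for a, v in x.items() if a not in rule}
        for a, v in sorted(candidates):  # sorted for a deterministic tie-break
            cand = {**rule, a: v}
            cp, cn = counts(cand, examples)
            if cp > 0 and heuristic(cp, cn) > best_score:
                best, best_score = cand, heuristic(cp, cn)
        if best is None:                 # no refinement improves the score
            return rule
        rule = best


def separate_and_conquer(examples: List[Example],
                         heuristic: Callable[[int, int], float] = precision) -> List[Rule]:
    """Learn rules one at a time; remove covered examples before the next rule."""
    theory: List[Rule] = []
    remaining = list(examples)
    while any(y for _, y in remaining):
        rule = learn_one_rule(remaining, heuristic)
        p, n = counts(rule, remaining)
        if p == 0 or (not rule and n > 0):   # no useful rule found; stop
            break
        theory.append(rule)
        # Global context: later rules only see examples not yet covered.
        remaining = [(x, y) for x, y in remaining if not covers(rule, x)]
    return theory
```

Because each rule is scored only on the examples that remain, a rule learned late in the loop may be judged on very few examples. This is the situation in which the trade-off between coverage and precision discussed in the abstract becomes relevant.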

Author information

Authors and Affiliations

Knowledge Engineering Group, TU Darmstadt, Hochschulstraße 10, D-64289, Darmstadt, Germany
Johannes Fürnkranz

Editor information

Editors and Affiliations

Computer Science VIII, Artificial Intelligence Unit, Technische Universität Dortmund, 44221, Dortmund, Germany
Katharina Morik
INSA-Lyon, LIRIS CNRS UMR5205, F-69621, Villeurbanne, France
Jean-François Boulicaut
Department of Computer Science, Universiteit Utrecht,
Arno Siebes

About this paper

Cite this paper

Fürnkranz, J. (2005). From Local to Global Patterns: Evaluation Issues in Rule Learning Algorithms. In: Morik, K., Boulicaut, J.-F., Siebes, A. (eds) Local Pattern Detection. Lecture Notes in Computer Science, vol 3539. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11504245_2

DOI: https://doi.org/10.1007/11504245_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26543-6
Online ISBN: 978-3-540-31894-1
eBook Packages: Computer Science, Computer Science (R0)
