From Local to Global Patterns: Evaluation Issues in Rule Learning Algorithms

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3539)

Abstract

Separate-and-conquer, or covering, rule learning algorithms may be viewed as a technique for using local pattern discovery to generate a global theory. Local patterns are learned one at a time, and each pattern is evaluated in a local context, with respect to the number of positive and negative examples that it covers. Global context is provided by removing the examples that are covered by previous patterns before learning a new rule. In this paper, we discuss several research issues that arise in this context. We start with a brief discussion of covering algorithms and their problems, and review a few suggestions for resolving them. We then discuss the suitability of a well-known family of evaluation metrics, and analyze how they trade off coverage and precision of a rule. Our conclusion is that in many applications, coverage is only needed for establishing statistical significance, and that the rule discovery process should focus on optimizing precision. As an alternative to coverage-based overfitting avoidance, we then investigate the feasibility of meta-learning a predictor for the true precision of a rule, based on its coverage on the training set. The results confirm that this is a valid approach, but also point to some shortcomings that need to be addressed in future work.
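
The covering loop described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the data representation, the function names, and the toy examples are all hypothetical, and the Laplace estimate (p + 1) / (p + n + 2) is used here only as one assumed example of a metric that trades off coverage and precision. Each rule is scored locally from the positive count p and negative count n it covers, and the examples it covers are removed before the next rule is learned.

```python
from typing import Dict, List, Tuple

Example = Tuple[Dict[str, str], bool]  # (attribute values, is_positive)
Rule = Dict[str, str]                  # conjunction of attribute == value tests


def covers(rule: Rule, attrs: Dict[str, str]) -> bool:
    """A rule covers an example if every one of its tests holds."""
    return all(attrs.get(a) == v for a, v in rule.items())


def counts(rule: Rule, examples: List[Example]) -> Tuple[int, int]:
    """Return (p, n): covered positive and covered negative examples."""
    p = sum(1 for attrs, label in examples if label and covers(rule, attrs))
    n = sum(1 for attrs, label in examples if not label and covers(rule, attrs))
    return p, n


def laplace(p: int, n: int) -> float:
    """Assumed metric: precision p / (p + n), smoothed toward 0.5 at low coverage."""
    return (p + 1) / (p + n + 2)


def learn_one_rule(examples: List[Example]) -> Rule:
    """Greedily add the single test that most improves the local score."""
    rule: Rule = {}
    best = laplace(*counts(rule, examples))
    while True:
        candidates = sorted({(a, v)
                             for attrs, _ in examples if covers(rule, attrs)
                             for a, v in attrs.items() if a not in rule})
        scored = [(laplace(*counts({**rule, a: v}, examples)), a, v)
                  for a, v in candidates]
        if not scored:
            break
        score, a, v = max(scored)
        if score <= best:  # no refinement improves the score
            break
        rule[a] = v
        best = score
    return rule


def separate_and_conquer(examples: List[Example]) -> List[Rule]:
    """Learn rules one at a time, removing covered examples after each."""
    theory: List[Rule] = []
    remaining = list(examples)
    while any(label for _, label in remaining):
        rule = learn_one_rule(remaining)
        p, _ = counts(rule, remaining)
        if not rule or p == 0:
            break
        theory.append(rule)
        # "Separate": drop everything the new rule covers.
        remaining = [(attrs, label) for attrs, label in remaining
                     if not covers(rule, attrs)]
    return theory


# Hypothetical toy data, for illustration only.
examples: List[Example] = [
    ({"outlook": "sunny", "windy": "no"}, True),
    ({"outlook": "sunny", "windy": "yes"}, True),
    ({"outlook": "rainy", "windy": "no"}, True),
    ({"outlook": "rainy", "windy": "yes"}, False),
    ({"outlook": "overcast", "windy": "yes"}, False),
]
theory = separate_and_conquer(examples)
print(theory)
```

On this toy data the second rule is learned only from the examples left over after the first rule's coverage has been removed, which is the sense in which earlier rules supply the global context for later ones.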

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Fürnkranz, J. (2005). From Local to Global Patterns: Evaluation Issues in Rule Learning Algorithms. In: Morik, K., Boulicaut, JF., Siebes, A. (eds) Local Pattern Detection. Lecture Notes in Computer Science, vol 3539. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11504245_2

  • DOI: https://doi.org/10.1007/11504245_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-26543-6

  • Online ISBN: 978-3-540-31894-1

  • eBook Packages: Computer Science, Computer Science (R0)
