Bayesian Confirmation Measures in Rule-Based Classification

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10312)

Abstract

With the rapid growth of available data, learning models are also growing in size. As a result, end-users are often faced with classification results that are hard to understand. This problem also affects rule-based classifiers, which usually concentrate on predictive accuracy and produce too many rules for a human expert to interpret. In this paper, we tackle the problem of pruning rule classifiers while retaining their descriptive properties. For this purpose, we analyze the use of confirmation measures as representatives of interestingness measures designed to select rules with desirable descriptive properties. To perform the analysis, we put forward the CM-CAR algorithm, which uses interestingness measures during rule pruning. Experiments involving 20 datasets show that, out of 12 analyzed confirmation measures, \(c_1\), F, and Z are best for general-purpose rule pruning and sorting. An additional analysis comparing results on balanced/imbalanced and binary/multi-class problems also highlights N, S, and \(c_3\) as measures for sorting rules on binary imbalanced datasets. The obtained results can be used to devise new classifiers that optimize confirmation measures during model training.
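
To make the notion of a confirmation measure concrete, the following is a minimal sketch, not the CM-CAR implementation, of four of the measures named above (S, N, F, and Z) computed from a rule's contingency table and used to sort rules. It assumes the standard definitions from the confirmation literature (Christensen's S, Nozick's N, Kemeny and Oppenheim's F, and the Z measure of Crupi et al.). The `Contingency` class, the function names, and the example counts are hypothetical illustrations; \(c_1\) and \(c_3\) are omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contingency:
    """Counts for a rule "if E then H" (hypothetical helper, not from the paper)."""
    a: int  # examples satisfying E and H
    b: int  # examples satisfying H but not E
    c: int  # examples satisfying E but not H
    d: int  # examples satisfying neither E nor H


# All functions assume every marginal count used as a denominator is non-zero.

def s_measure(t: Contingency) -> float:
    """S = P(H|E) - P(H|not E)."""
    return t.a / (t.a + t.c) - t.b / (t.b + t.d)


def n_measure(t: Contingency) -> float:
    """N = P(E|H) - P(E|not H)."""
    return t.a / (t.a + t.b) - t.c / (t.c + t.d)


def f_measure(t: Contingency) -> float:
    """F = (P(E|H) - P(E|not H)) / (P(E|H) + P(E|not H))."""
    p_e_h = t.a / (t.a + t.b)
    p_e_nh = t.c / (t.c + t.d)
    return (p_e_h - p_e_nh) / (p_e_h + p_e_nh)


def z_measure(t: Contingency) -> float:
    """Z normalizes P(H|E) - P(H) by 1 - P(H) when confirming, by P(H) otherwise."""
    n = t.a + t.b + t.c + t.d
    p_h = (t.a + t.b) / n
    diff = t.a / (t.a + t.c) - p_h
    return diff / (1 - p_h) if diff >= 0 else diff / p_h


# Hypothetical rules: name -> contingency table.
rules = {
    "r1": Contingency(a=40, b=10, c=10, d=40),
    "r2": Contingency(a=30, b=20, c=10, d=40),
    "r3": Contingency(a=10, b=40, c=40, d=10),
}

# Sort rules from most to least confirming according to Z.
ranked = sorted(rules, key=lambda name: z_measure(rules[name]), reverse=True)
print(ranked)
```

All four measures are positive when the premise E raises the probability of the conclusion H and negative when it lowers it, which is what makes them candidates for ordering or discarding rules. Here `r3` receives a negative score under each measure and is therefore ranked last.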

Notes

  1. Sources available at: http://www.cs.put.poznan.pl/dbrzezinski/software.php.

  2. Supplement: http://www.cs.put.poznan.pl/dbrzezinski/software/CMCAR.html.

Acknowledgements

This work was supported by the National Science Centre grant DEC-2013/11/B/ST6/00963. D. Brzezinski acknowledges the support of an FNP START scholarship and the Institute of Computing Science Statutory Fund.

Author information

Corresponding author

Correspondence to Dariusz Brzezinski.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Brzezinski, D., Grudziński, Z., Szczęch, I. (2017). Bayesian Confirmation Measures in Rule-Based Classification. In: Appice, A., Ceci, M., Loglisci, C., Masciari, E., Raś, Z. (eds) New Frontiers in Mining Complex Patterns. NFMCP 2016. Lecture Notes in Computer Science, vol 10312. Springer, Cham. https://doi.org/10.1007/978-3-319-61461-8_3

  • DOI: https://doi.org/10.1007/978-3-319-61461-8_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-61460-1

  • Online ISBN: 978-3-319-61461-8

  • eBook Packages: Computer Science, Computer Science (R0)
