Bayesian Confirmation Measures in Rule-Based Classification

Brzezinski, Dariusz; Grudziński, Zbigniew; Szczęch, Izabela

doi:10.1007/978-3-319-61461-8_3

Conference paper
First Online: 02 July 2017

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10312)

Abstract

With the rapid growth of available data, learning models are also growing in size. As a result, end users often face classification results that are hard to understand. This problem also affects rule-based classifiers, which usually concentrate on predictive accuracy and produce too many rules for a human expert to interpret. In this paper, we tackle the problem of pruning rule classifiers while retaining their descriptive properties. For this purpose, we analyze the use of confirmation measures as representatives of interestingness measures designed to select rules with desirable descriptive properties. To perform the analysis, we put forward the CM-CAR algorithm, which uses interestingness measures during rule pruning. Experiments involving 20 datasets show that, out of 12 analyzed confirmation measures, \(c_1\), F, and Z are best for general-purpose rule pruning and sorting. An additional analysis comparing results on balanced/imbalanced and binary/multi-class problems also highlights N, S, and \(c_3\) as measures for sorting rules on binary imbalanced datasets. The obtained results can be used to devise new classifiers that optimize confirmation measures during model training.
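
To make the idea of scoring rules with confirmation measures more concrete, the following is a minimal Python sketch for a rule "if E then H". It is not the CM-CAR algorithm, whose details are not given in this abstract. It assumes the definitions of S, N, F, and Z commonly used in the confirmation literature (Christensen, Nozick, Kemeny and Oppenheim, Crupi et al.) and estimates probabilities from a 2x2 contingency table. The names `Contingency` and `rank_rules`, and the rule of keeping only positively confirmed rules, are hypothetical and chosen only for illustration. The measures \(c_1\) and \(c_3\) are left out because their definitions do not appear here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Contingency:
    """Counts for a rule 'if E then H' (hypothetical helper type).

    Assumes every row and column total is non-zero.
    """
    a: int  # examples satisfying E and H
    b: int  # examples satisfying H but not E
    c: int  # examples satisfying E but not H
    d: int  # examples satisfying neither E nor H


def measure_s(t: Contingency) -> float:
    # S(H, E) = P(H|E) - P(H|not E)
    return t.a / (t.a + t.c) - t.b / (t.b + t.d)


def measure_n(t: Contingency) -> float:
    # N(H, E) = P(E|H) - P(E|not H)
    return t.a / (t.a + t.b) - t.c / (t.c + t.d)


def measure_f(t: Contingency) -> float:
    # F(H, E) = (P(E|H) - P(E|not H)) / (P(E|H) + P(E|not H))
    p_e_h = t.a / (t.a + t.b)
    p_e_not_h = t.c / (t.c + t.d)
    return (p_e_h - p_e_not_h) / (p_e_h + p_e_not_h)


def measure_z(t: Contingency) -> float:
    # Z(H, E): P(H|E) - P(H), normalised by 1 - P(H) when E confirms H
    # and by P(H) when E disconfirms H.
    total = t.a + t.b + t.c + t.d
    p_h = (t.a + t.b) / total
    diff = t.a / (t.a + t.c) - p_h
    return diff / (1 - p_h) if diff >= 0 else diff / p_h


def rank_rules(
    rules: Dict[str, Contingency],
    measure: Callable[[Contingency], float],
) -> List[Tuple[str, float]]:
    """Illustrative pruning: keep positively confirmed rules, best first."""
    scored = [(name, measure(table)) for name, table in rules.items()]
    kept = [(name, score) for name, score in scored if score > 0]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Made-up counts, used only to exercise the functions.
    rules = {
        "r1": Contingency(40, 10, 10, 40),
        "r2": Contingency(10, 40, 40, 10),
        "r3": Contingency(30, 20, 20, 30),
    }
    for name, score in rank_rules(rules, measure_z):
        print(name, round(score, 3))
```

All four measures are positive when the premise E supports the conclusion H and negative when it counts against it, so a threshold at zero is one plausible, though assumed, way to discard rules before sorting the remainder.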

Notes

1. Sources available at: http://www.cs.put.poznan.pl/dbrzezinski/software.php.
2. Supplement: http://www.cs.put.poznan.pl/dbrzezinski/software/CMCAR.html.

Acknowledgements

This work was supported by the National Science Centre grant DEC-2013/11/B/ST6/00963. D. Brzezinski acknowledges the support of an FNP START scholarship and the Institute of Computing Science Statutory Fund.

Author information

Authors and Affiliations

Institute of Computing Science, Poznan University of Technology, ul. Piotrowo 2, 60-965, Poznan, Poland
Dariusz Brzezinski, Zbigniew Grudziński & Izabela Szczęch

Corresponding author

Correspondence to Dariusz Brzezinski.

Editor information

Editors and Affiliations

Università degli Studi di Bari Aldo Moro, Bari, Italy
Annalisa Appice
Università degli Studi di Bari Aldo Moro, Bari, Italy
Michelangelo Ceci
Università degli Studi di Bari Aldo Moro, Bari, Italy
Corrado Loglisci
ICAR-CNR, Rende, Italy
Elio Masciari
University of North Carolina, Charlotte, North Carolina, USA
Zbigniew W. Raś

About this paper

Cite this paper

Brzezinski, D., Grudziński, Z., Szczęch, I. (2017). Bayesian Confirmation Measures in Rule-Based Classification. In: Appice, A., Ceci, M., Loglisci, C., Masciari, E., Raś, Z. (eds) New Frontiers in Mining Complex Patterns. NFMCP 2016. Lecture Notes in Computer Science, vol 10312. Springer, Cham. https://doi.org/10.1007/978-3-319-61461-8_3

DOI: https://doi.org/10.1007/978-3-319-61461-8_3
Published: 02 July 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-61460-1
Online ISBN: 978-3-319-61461-8
eBook Packages: Computer Science, Computer Science (R0)
