Abstract
In the previous chapter we discussed methods that find patterns of various shapes in data sets. All of these methods require a measure of similarity in order to group similar objects. In this chapter we discuss methods that address a very different setup: instead of finding structure in a data set, we focus on methods that find explanations for an unknown dependency within the data. Such a search usually centers on a so-called target attribute, that is, we are particularly interested in why one specific attribute has a certain value. If the target attribute is nominal, we speak of a classification problem; if it is numerical, we speak of a regression problem. Examples of such problems are understanding why a customer belongs to the category of people who cancel their account (i.e., classifying her into a yes/no category) or better understanding the risk factors of customers in general.
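The distinction between the two problem types can be sketched in a few lines of Python. The customer data, the attribute names, and the two toy models below are invented for illustration only; they are not taken from the chapter:

```python
# Minimal sketch of the two problem types: a nominal target
# (classification) versus a numeric target (regression).
# All data and thresholds here are hypothetical.

# Each record: descriptive attributes plus a target attribute.
customers = [
    {"age": 25, "contract_months": 3,  "cancelled": "yes"},
    {"age": 47, "contract_months": 36, "cancelled": "no"},
    {"age": 31, "contract_months": 6,  "cancelled": "yes"},
    {"age": 52, "contract_months": 48, "cancelled": "no"},
]

# Classification: the target ("cancelled") is nominal.
# A trivial model: predict "yes" if the contract is shorter than a threshold.
def classify(record, threshold=12):
    return "yes" if record["contract_months"] < threshold else "no"

# Regression: the target is numeric, e.g. a risk score in [0, 1].
# A trivial model: risk decreases linearly with contract length.
def predict_risk(record):
    return max(0.0, 1.0 - record["contract_months"] / 48)

print([classify(c) for c in customers])                # ['yes', 'no', 'yes', 'no']
print([round(predict_risk(c), 2) for c in customers])  # [0.94, 0.25, 0.88, 0.0]
```

The methods in this chapter replace such hand-written rules with models learned from the data itself.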
Notes
1. Rumors say that ID3 stands for "Iterative Dichotomiser 3" (from Greek dichotomia: division); supposedly it was Quinlan's third attempt. Another interpretation, offered by one of the authors, is "Induction of Decision 3rees."
2. Quinlan later also developed methods for regression problems, similar to CART.
3. Since one or more of them may be metric, we may have to use a probability density function f to refer to the descriptive attributes: f(x∣y). However, we ignore such notational subtleties here.
4. This is a prior probability, because it describes the class probability before observing the values of any descriptive attributes.
5. For more details see also Sect. 5.4.
6. Note, however, that this second property can also be a disadvantage, as it can give outliers an overly strong influence on the regression result.
7. Note that we are not saying much about the truthfulness or precision of rules at this stage.
8. Note that this is a substantial deviation from the abstract concept of rule learners in Mitchell's version space setup: real-world rule learners usually do not investigate all more general (or more specific) rules but only a subset chosen by the employed heuristic(s).
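The caveat in note 6, that minimizing squared errors gives outliers a strong pull on the fit, can be illustrated with a small sketch. The data set and the plain ordinary-least-squares implementation below are illustrative only, not taken from the chapter:

```python
# Ordinary least squares fit y = a + b*x for a tiny (hypothetical)
# data set, once without and once with a single outlier.

def ols(xs, ys):
    """Return (intercept, slope) minimizing the sum of squared errors."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

xs = [1, 2, 3, 4, 5]
ys = [1, 2, 3, 4, 5]          # perfectly linear data
print(ols(xs, ys))            # (0.0, 1.0): slope 1, as expected

ys_out = [1, 2, 3, 4, 25]     # same data, one outlier at x = 5
print(ols(xs, ys_out))        # (-8.0, 5.0): the slope is pulled from 1 to 5
```

Because the outlier's residual enters the objective squared, a single corrupted point dominates the fit, which is exactly the disadvantage the note warns about.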
References
Albert, A.: Regression and the Moore–Penrose Pseudoinverse. Academic Press, New York (1972)
Anderson, E.: The irises of the Gaspé Peninsula. Bull. Am. Iris Soc. 59, 2–5 (1935)
Berthold, M.R.: Fuzzy logic. In: Berthold, M.R., Hand, D.J. (eds.) Intelligent Data Analysis: An Introduction, 2nd edn. Springer, Berlin (2003)
Borgelt, C., Steinbrecher, M., Kruse, R.: Graphical Models—Representations for Learning, Reasoning and Data Mining, 2nd edn. Wiley, Chichester (2009)
Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J.: CART: Classification and Regression Trees. Wadsworth, Belmont (1983)
Clark, P., Niblett, T.: The CN2 induction algorithm. Mach. Learn. 3(4), 261–283 (1989)
Domingos, P., Pazzani, M.: On the optimality of the simple Bayesian classifier under zero-one loss. Mach. Learn. 29, 103–137 (1997)
Fisher, R.A.: The use of multiple measurements in taxonomic problems. Ann. Eugen. 7(2), 179–188 (1936)
Friedman, N., Goldszmidt, M.: Building classifiers using Bayesian networks. In: Proc. 13th Nat. Conf. on Artificial Intelligence (AAAI’96, Portland, OR, USA), pp. 1277–1284. AAAI Press, Menlo Park (1996)
Geiger, D.: An entropy-based learning algorithm of Bayesian conditional trees. In: Proc. 8th Conf. on Uncertainty in Artificial Intelligence (UAI’92, Stanford, CA, USA), pp. 92–97. Morgan Kaufmann, San Mateo (1992)
Goodman, R.M., Smyth, P.: An information-theoretic model for rule-based expert systems. In: Int. Symposium on Information Theory, Kobe, Japan (1988)
Janikow, C.Z.: Fuzzy decision trees: issues and methods. IEEE Trans. Syst. Man, Cybern., Part B 28(1), 1–14 (1998)
Jensen, F.V., Nielsen, T.D.: Bayesian Networks and Decision Graphs, 2nd edn. Springer, London (2007)
Larrañaga, P., Poza, M., Yurramendi, Y., Murga, R., Kuijpers, C.: Structural learning of Bayesian networks by genetic algorithms: a performance analysis of control parameters. IEEE Trans. Pattern Anal. Mach. Intell. 18, 912–926 (1996)
Mitchell, T.: Machine Learning. McGraw-Hill, New York (1997)
Nauck, D., Klawonn, F., Kruse, R.: Neuro-Fuzzy Systems. Wiley, Chichester (1997)
Quinlan, J.R.: Induction of decision trees. Mach. Learn. 1(1), 81–106 (1986)
Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo (1993)
Quinlan, J.R., Cameron-Jones, R.M.: FOIL: a midterm report. In: Proc. European Conference on Machine Learning. Lecture Notes in Computer Science, vol. 667, pp. 3–20. Springer, Berlin (1993)
Sahami, M.: Learning limited dependence Bayesian classifiers. In: Proc. 2nd Int. Conf. on Knowledge Discovery and Data Mining (KDD’96, Portland, OR, USA), pp. 335–338. AAAI Press, Menlo Park (1996)
© 2010 Springer-Verlag London Limited

Berthold, M.R., Borgelt, C., Höppner, F., Klawonn, F. (2010). Finding Explanations. In: Guide to Intelligent Data Analysis. Texts in Computer Science. Springer, London. https://doi.org/10.1007/978-1-84882-260-3_8