Finding Explanations

Part of the book series: Texts in Computer Science (TCS)

Abstract

In the previous chapter we discussed methods that find patterns of various shapes in data sets. All of these methods required similarity measures in order to group similar objects. In this chapter we turn to a very different setup: instead of finding structure in a data set, we now focus on methods that find explanations for an unknown dependency within the data. Such a search usually revolves around a so-called target attribute, that is, we are particularly interested in why one specific attribute has a certain value. If the target attribute is a nominal variable, we speak of a classification problem; if it is numerical, we speak of a regression problem. Examples of such problems are understanding why a customer belongs to the category of people who cancel their accounts (e.g., classifying her into a yes/no category) or better understanding the risk factors of customers in general.
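
To make this distinction concrete, here is a minimal sketch, not taken from the book, of both problem types using scikit-learn decision trees; the three descriptive attributes and the churn-style targets below are synthetic and purely illustrative:

    # Hypothetical example: the data and attribute roles are made up.
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 3))        # descriptive attributes

    # Classification: the target attribute is nominal (churn: yes/no).
    y_class = (X[:, 0] + rng.normal(scale=0.5, size=100) > 0).astype(int)
    clf = DecisionTreeClassifier(max_depth=3).fit(X, y_class)
    print("predicted class:", clf.predict(X[:1]))

    # Regression: the target attribute is numerical (e.g., a risk score).
    y_reg = 2.0 * X[:, 1] + rng.normal(scale=0.1, size=100)
    reg = DecisionTreeRegressor(max_depth=3).fit(X, y_reg)
    print("predicted value:", reg.predict(X[:1]))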

Notes

  1. Rumor has it that ID3 stands for “Iterative Dichotomiser 3” (from Greek dichotomia: division in two); supposedly it was Quinlan's third attempt. Another interpretation, offered by one of the authors, is “Induction of Decision 3rees.”

  2. Quinlan later also developed methods for regression problems, similar to CART.

  3. Since one or more of them may be metric, we may have to use a probability density function f to refer to the descriptive attributes: f(x | y). However, we ignore such notational subtleties here.

  4. This is a prior probability because it describes the class probability before the values of any descriptive attributes have been observed (see the first sketch after these notes).

  5. For more details see also Sect. 5.4.

  6. Note, however, that this second property can also be a disadvantage: it can give outliers an overly strong influence on the regression result (see the second sketch after these notes).

  7. Note that we are not saying much about the truthfulness or precision of rules at this stage.

  8. Note that this is a substantial deviation from the abstract concepts of rule learners in Mitchell's version-space setup: real-world rule learners usually do not investigate all more general (or more specific) rules but only a subset, chosen by the employed heuristic(s).
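
The sketch promised in note 4 (it also illustrates the conditional densities of note 3): class priors p(y) and Gaussian class-conditional densities f(x | y) combined via Bayes' rule for a single metric attribute. This is our own plain-NumPy illustration on made-up numbers, not code from the chapter:

    import numpy as np

    x = np.array([1.2, 0.8, 1.0, 3.1, 2.9, 3.3, 3.0])  # metric attribute
    y = np.array([0,   0,   0,   1,   1,   1,   1])    # class labels

    def gaussian_pdf(v, mean, std):
        return np.exp(-0.5 * ((v - mean) / std) ** 2) / (std * np.sqrt(2 * np.pi))

    classes = np.unique(y)
    priors = {c: np.mean(y == c) for c in classes}     # p(y): before seeing x
    params = {c: (x[y == c].mean(), x[y == c].std(ddof=1)) for c in classes}

    def posterior(v):
        # unnormalized p(y) * f(v | y), normalized over the classes
        scores = np.array([priors[c] * gaussian_pdf(v, *params[c]) for c in classes])
        return scores / scores.sum()

    print(posterior(1.1))  # overwhelmingly class 0
    print(posterior(3.2))  # overwhelmingly class 1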
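
And the sketch referenced in note 6: because squared error weights large deviations so heavily, a single gross outlier can pull a least-squares line noticeably. Again a toy construction of ours, assuming only NumPy:

    import numpy as np

    x = np.arange(10, dtype=float)
    y = 2.0 * x + 1.0                    # clean linear data, true slope 2
    slope_clean, _ = np.polyfit(x, y, 1)

    y_out = y.copy()
    y_out[9] = 60.0                      # one grossly wrong measurement
    slope_out, _ = np.polyfit(x, y_out, 1)

    print(f"slope without outlier: {slope_clean:.2f}")  # 2.00
    print(f"slope with outlier:    {slope_out:.2f}")    # about 4.2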

References

  1. Albert, A.: Regression and the Moore–Penrose Pseudoinverse. Academic Press, New York (1972)

  2. Anderson, E.: The irises of the Gaspé Peninsula. Bull. Am. Iris Soc. 59, 2–5 (1935)

  3. Berthold, M.R.: Fuzzy logic. In: Berthold, M.R., Hand, D.J. (eds.) Intelligent Data Analysis: An Introduction, 2nd edn. Springer, Berlin (2003)

  4. Borgelt, C., Steinbrecher, M., Kruse, R.: Graphical Models: Representations for Learning, Reasoning and Data Mining, 2nd edn. Wiley, Chichester (2009)

  5. Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J.: CART: Classification and Regression Trees. Wadsworth, Belmont (1984)

  6. Clark, P., Niblett, T.: The CN2 induction algorithm. Mach. Learn. 3(4), 261–283 (1989)

  7. Domingos, P., Pazzani, M.: On the optimality of the simple Bayesian classifier under zero-one loss. Mach. Learn. 29, 103–137 (1997)

  8. Fisher, R.A.: The use of multiple measurements in taxonomic problems. Ann. Eugen. 7(2), 179–188 (1936)

  9. Friedman, N., Goldszmidt, M.: Building classifiers using Bayesian networks. In: Proc. 13th Nat. Conf. on Artificial Intelligence (AAAI'96, Portland, OR, USA), pp. 1277–1284. AAAI Press, Menlo Park (1996)

  10. Geiger, D.: An entropy-based learning algorithm of Bayesian conditional trees. In: Proc. 8th Conf. on Uncertainty in Artificial Intelligence (UAI'92, Stanford, CA, USA), pp. 92–97. Morgan Kaufmann, San Mateo (1992)

  11. Goodman, R.M., Smyth, P.: An information-theoretic model for rule-based expert systems. In: Int. Symposium on Information Theory, Kobe, Japan (1988)

  12. Janikow, C.Z.: Fuzzy decision trees: issues and methods. IEEE Trans. Syst. Man Cybern., Part B 28(1), 1–14 (1998)

  13. Jensen, F.V., Nielsen, T.D.: Bayesian Networks and Decision Graphs, 2nd edn. Springer, London (2007)

  14. Larrañaga, P., Poza, M., Yurramendi, Y., Murga, R., Kuijpers, C.: Structural learning of Bayesian networks by genetic algorithms: a performance analysis of control parameters. IEEE Trans. Pattern Anal. Mach. Intell. 18, 912–926 (1996)

  15. Mitchell, T.: Machine Learning. McGraw-Hill, New York (1997)

  16. Nauck, D., Klawonn, F., Kruse, R.: Neuro-Fuzzy Systems. Wiley, Chichester (1997)

  17. Quinlan, J.R.: Induction of decision trees. Mach. Learn. 1(1), 81–106 (1986)

  18. Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo (1993)

  19. Quinlan, J.R., Cameron-Jones, R.M.: FOIL: a midterm report. In: Proc. European Conference on Machine Learning. Lecture Notes in Computer Science, vol. 667, pp. 3–20. Springer, Berlin (1993)

  20. Sahami, M.: Learning limited dependence Bayesian classifiers. In: Proc. 2nd Int. Conf. on Knowledge Discovery and Data Mining (KDD'96, Portland, OR, USA), pp. 335–338. AAAI Press, Menlo Park (1996)

Author information

Corresponding author

Correspondence to Michael R. Berthold.

Copyright information

© 2010 Springer-Verlag London Limited

About this chapter

Cite this chapter

Berthold, M.R., Borgelt, C., Höppner, F., Klawonn, F. (2010). Finding Explanations. In: Guide to Intelligent Data Analysis. Texts in Computer Science. Springer, London. https://doi.org/10.1007/978-1-84882-260-3_8

  • DOI: https://doi.org/10.1007/978-1-84882-260-3_8

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-84882-259-7

  • Online ISBN: 978-1-84882-260-3

  • eBook Packages: Computer Science, Computer Science (R0)
