Exceptional Model Mining

Knobbe, Arno; Feelders, Ad; Leman, Dennis

doi:10.1007/978-3-642-23241-1_9


Part of the book series: Intelligent Systems Reference Library (ISRL, volume 24)


Abstract

In most databases, it is possible to identify small partitions of the data where the observed distribution is notably different from that of the database as a whole. In classical subgroup discovery, one considers the distribution of a single nominal attribute, and exceptional subgroups show a surprising increase in the occurrence of one of its values. In this paper, we describe Exceptional Model Mining (EMM), a framework that allows for more complicated target concepts. Rather than finding subgroups based on the distribution of a single target attribute, EMM finds subgroups where a model fitted to that subgroup is somehow exceptional. We discuss regression as well as classification models, and define quality measures that determine how exceptional a given model on a subgroup is. Our framework is general enough to be applied to many types of models, even from other paradigms such as association analysis and graphical modeling.
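As a rough illustration of the idea in the abstract, the sketch below scores candidate subgroups by how much a simple model fitted inside the subgroup differs from the same model fitted on the rest of the data. The abstract does not state the chapter's actual quality measures, so everything here is an assumption made for illustration: the "model" is a Pearson correlation between two target attributes, the quality measure is the absolute difference between the subgroup's correlation and its complement's, and the data, attribute names (`region`, `garden`, `size`, `price`) and function names are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def correlation_gap(rows, condition, x, y):
    """Illustrative quality measure (an assumption, not the chapter's):
    |correlation inside the subgroup - correlation in its complement|."""
    inside = [r for r in rows if condition(r)]
    outside = [r for r in rows if not condition(r)]
    r_in = pearson([r[x] for r in inside], [r[y] for r in inside])
    r_out = pearson([r[x] for r in outside], [r[y] for r in outside])
    return abs(r_in - r_out)


# Hypothetical toy data: price rises with size in region A, falls in region B.
rows = [
    {"region": "A", "garden": True, "size": 1, "price": 10},
    {"region": "A", "garden": True, "size": 2, "price": 20},
    {"region": "A", "garden": False, "size": 3, "price": 30},
    {"region": "A", "garden": False, "size": 4, "price": 40},
    {"region": "B", "garden": True, "size": 1, "price": 40},
    {"region": "B", "garden": True, "size": 2, "price": 30},
    {"region": "B", "garden": False, "size": 3, "price": 20},
    {"region": "B", "garden": False, "size": 4, "price": 10},
]

# Candidate subgroup descriptions over the descriptive attributes.
candidates = {
    "region = A": lambda r: r["region"] == "A",
    "garden = True": lambda r: r["garden"],
}

# Rank the candidates by how exceptional their fitted model is.
ranked = sorted(
    ((correlation_gap(rows, cond, "size", "price"), name)
     for name, cond in candidates.items()),
    reverse=True,
)

for quality, name in ranked:
    print(f"{name}: quality = {quality:.3f}")
```

In this toy data the description `region = A` ranks first, because the relation between `size` and `price` reverses inside that subgroup, while `garden = True` selects a mix of both regions and shows no difference from its complement. A practical search would enumerate many more candidate descriptions, for example with beam search, and would need to guard against subgroups too small to fit a model.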



Author information

Authors and Affiliations

LIACS, Leiden University, Niels Bohrweg 1, NL-2333 CA, Leiden, The Netherlands
Arno Knobbe
Utrecht University, P.O. box 80 089, NL-3508 TB, Utrecht, The Netherlands
Ad Feelders & Dennis Leman


Editor information

Editors and Affiliations

Department of Statistics and Applied Probability, University of California, 93106, Santa Barbara, CA, USA
Dawn E. Holmes
Knowledge-Based Engineering, University of South Australia, 5095, Adelaide Mawson Lakes, SA, Australia
Lakhmi C. Jain


About this chapter

Cite this chapter

Knobbe, A., Feelders, A., Leman, D. (2012). Exceptional Model Mining. In: Holmes, D.E., Jain, L.C. (eds) Data Mining: Foundations and Intelligent Paradigms. Intelligent Systems Reference Library, vol 24. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23241-1_9


DOI: https://doi.org/10.1007/978-3-642-23241-1_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23240-4
Online ISBN: 978-3-642-23241-1
eBook Packages: Engineering, Engineering (R0)
