Abstract
This paper addresses the discovery of surprising patterns. Several authors have recently addressed the task of discovering surprising prediction rules. We focus instead on a quite different kind of pattern, namely the occurrence of Simpson’s paradox. Intuitively, the fact that this is a paradox suggests that it has great potential to be a surprising pattern for the user. With this motivation, we make the detection of Simpson’s paradox the central goal of a data mining algorithm explicitly designed to discover surprising patterns. We present computational results showing surprising occurrences of the paradox in some public-domain data sets. In addition, we propose a method for ranking the discovered instances of the paradox in decreasing order of estimated degree of surprisingness.
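For readers unfamiliar with the pattern, the following is a minimal sketch of what checking for one occurrence of Simpson’s paradox might look like. It is not the algorithm proposed in the paper: the function name `simpson_reversal`, the input layout, and the counts are all assumptions made for illustration. The check compares the success rates of two groups inside every partition of the data and in the aggregated data, and reports a paradox when every partition agrees on one direction while the aggregate points the other way.

```python
from fractions import Fraction


def simpson_reversal(strata):
    """Return True if the comparison between two groups reverses on aggregation.

    `strata` is a list with one entry per partition of the data. Each entry is
    ((successes_a, total_a), (successes_b, total_b)). This layout is a
    hypothetical choice for this sketch, not taken from the paper.
    """
    def sign(x, y):
        # +1 if x > y, -1 if x < y, 0 if equal
        return (x > y) - (x < y)

    # Direction of the rate comparison inside each partition (exact arithmetic).
    per_stratum = [
        sign(Fraction(sa, na), Fraction(sb, nb))
        for (sa, na), (sb, nb) in strata
    ]

    # Direction of the rate comparison after pooling all partitions.
    tot_sa = sum(a[0] for a, _ in strata)
    tot_na = sum(a[1] for a, _ in strata)
    tot_sb = sum(b[0] for _, b in strata)
    tot_nb = sum(b[1] for _, b in strata)
    overall = sign(Fraction(tot_sa, tot_na), Fraction(tot_sb, tot_nb))

    # Paradox: all partitions agree on a strict direction, the aggregate is opposite.
    first = per_stratum[0]
    return first != 0 and all(s == first for s in per_stratum) and overall == -first


# Illustrative counts only (not from the paper's data sets).
# Group A wins in both partitions, yet group B wins once the data are pooled.
example = [
    ((81, 87), (234, 270)),
    ((192, 263), (55, 80)),
]
no_reversal = [
    ((8, 10), (5, 10)),
    ((6, 10), (4, 10)),
]

print(simpson_reversal(example))
print(simpson_reversal(no_reversal))
```

Exact fractions are used so that near-ties are not decided by floating-point rounding. A mining algorithm in the spirit of the abstract would presumably run such a check over many choices of partitioning attribute, but how the paper enumerates candidates and scores their surprisingness is not stated here.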
Copyright information
© 2000 Springer-Verlag London Limited
About this paper
Cite this paper
Fabris, C.C., Freitas, A.A. (2000). Discovering Surprising Patterns by Detecting Occurrences of Simpson’s Paradox. In: Bramer, M., Macintosh, A., Coenen, F. (eds) Research and Development in Intelligent Systems XVI. Springer, London. https://doi.org/10.1007/978-1-4471-0745-3_10
DOI: https://doi.org/10.1007/978-1-4471-0745-3_10
Publisher Name: Springer, London
Print ISBN: 978-1-85233-231-0
Online ISBN: 978-1-4471-0745-3
eBook Packages: Springer Book Archive