Discovering Surprising Patterns by Detecting Occurrences of Simpson’s Paradox

  • Conference paper
Research and Development in Intelligent Systems XVI

Abstract

This paper addresses the discovery of surprising patterns. Recently, several authors have addressed the task of discovering surprising prediction rules. However, we do not focus on prediction rules, but rather on a quite different kind of pattern, namely the occurrence of Simpson’s paradox. Intuitively, the fact that this is a paradox suggests that it has a great potential to be a surprising pattern for the user. With this motivation, we make the detection of Simpson’s paradox the central goal of a data mining algorithm explicitly designed to discover surprising patterns. We present computational results showing surprising occurrences of the paradox in some public-domain data sets. In addition, we propose a method for ranking the discovered instances of the paradox in decreasing order of estimated degree of surprisingness.
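To make the pattern concrete, the following is a minimal sketch of how one occurrence of Simpson's paradox could be checked in a two-group, stratified table of counts. It is not the algorithm from the paper, whose details are not given in the abstract. The function name `simpson_reversal`, the data layout, and the counts are all assumptions chosen for illustration. The check reports a paradox when one group has the higher success rate in every stratum but the lower success rate once the strata are pooled.

```python
from collections import defaultdict


def simpson_reversal(counts, group_a, group_b):
    """Return True if the comparison between two groups reverses on pooling.

    counts maps (group, stratum) -> (successes, total). This layout is an
    assumption made for this sketch, not a format taken from the paper.
    """
    # Pool successes and totals over all strata for each group.
    pooled = defaultdict(lambda: [0, 0])
    strata = set()
    for (group, stratum), (successes, total) in counts.items():
        pooled[group][0] += successes
        pooled[group][1] += total
        strata.add(stratum)

    def sign(x):
        return (x > 0) - (x < 0)

    # Direction of the pooled difference in success rates (a minus b).
    overall = sign(pooled[group_a][0] / pooled[group_a][1]
                   - pooled[group_b][0] / pooled[group_b][1])

    # Direction of the difference inside each stratum.
    within = set()
    for stratum in strata:
        sa, na = counts[(group_a, stratum)]
        sb, nb = counts[(group_b, stratum)]
        within.add(sign(sa / na - sb / nb))

    # Paradox: every stratum agrees on one direction, and the pooled
    # comparison points strictly the other way.
    return overall != 0 and within == {-overall}


# Illustrative counts (hypothetical): A wins in each stratum, B wins pooled.
counts = {
    ("A", "small"): (81, 87),
    ("A", "large"): (192, 263),
    ("B", "small"): (234, 270),
    ("B", "large"): (55, 80),
}
print(simpson_reversal(counts, "A", "B"))
```

A mining procedure in this spirit would presumably run such a check over many choices of grouping attribute, stratifying attribute, and goal attribute. How the paper enumerates those choices and how it estimates the degree of surprisingness are not stated in the abstract, so neither is sketched here.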

Copyright information

© 2000 Springer-Verlag London Limited

About this paper

Cite this paper

Fabris, C.C., Freitas, A.A. (2000). Discovering Surprising Patterns by Detecting Occurrences of Simpson’s Paradox. In: Bramer, M., Macintosh, A., Coenen, F. (eds) Research and Development in Intelligent Systems XVI. Springer, London. https://doi.org/10.1007/978-1-4471-0745-3_10

  • DOI: https://doi.org/10.1007/978-1-4471-0745-3_10

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-85233-231-0

  • Online ISBN: 978-1-4471-0745-3

  • eBook Packages: Springer Book Archive
