Beyond Simpson’s Paradox: One Problem in Data Science

  • Chikio Hayashi
  • Kazue Yamaoka
Conference paper
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)


In the present paper, the conditions under which Simpson’s paradox does not occur are discussed for various cases. These conditions are first obtained from the descriptive point of view and then under assumed prior probability distributions of the parameters. The robustness of the results with respect to the choice of prior probability distribution is also discussed. In practical terms, the result is expressed through the magnitude of the odds ratio (or relative risk): Simpson’s paradox does not occur if the odds ratio is greater than, or less than, a certain value, which depends on the case considered.
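As a rough illustration of the quantity the abstract refers to, the following sketch compares stratum-level odds ratios with the odds ratio of the collapsed (pooled) table and flags a reversal. It is not the authors' method, and it does not reproduce their threshold conditions; the function names and the counts are made up for this example.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio of the 2x2 table [[a, b], [c, d]] (all cells assumed > 0)."""
    return (a * d) / (b * c)


def check_reversal(strata):
    """Return (stratum odds ratios, pooled odds ratio, reversal flag).

    `strata` is a list of (a, b, c, d) tuples, one 2x2 table per stratum.
    A reversal is flagged when every stratum odds ratio lies on one side
    of 1 while the pooled odds ratio lies on the other side.
    """
    stratum_ors = [odds_ratio(*t) for t in strata]
    # Collapse over strata by summing the tables cell by cell.
    pooled = [sum(t[i] for t in strata) for i in range(4)]
    pooled_or = odds_ratio(*pooled)
    all_above = all(o > 1 for o in stratum_ors)
    all_below = all(o < 1 for o in stratum_ors)
    reversal = (all_above and pooled_or < 1) or (all_below and pooled_or > 1)
    return stratum_ors, pooled_or, reversal


# Hypothetical counts chosen only to produce a reversal.
strata = [(8, 2, 70, 30), (20, 80, 1, 9)]
stratum_ors, pooled_or, reversal = check_reversal(strata)
print(stratum_ors, pooled_or, reversal)
```

With these invented counts both stratum odds ratios exceed 1 while the pooled odds ratio falls below 1, which is the reversal known as Simpson’s paradox. Two strata with identical tables would give the same odds ratio before and after pooling, so no reversal would be flagged.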


Simpson’s paradox; conditions of non-paradox; descriptive approach; prior probability distribution of a parameter; robustness of solution; Monte Carlo solution





Copyright information

© Springer-Verlag Berlin · Heidelberg 1998

Authors and Affiliations

  • Chikio Hayashi, Institute of Statistical Mathematics, Japan
  • Kazue Yamaoka, Department of Hygiene and Public Health, School of Medicine, Teikyo University, Japan
