Beyond Simpson’s Paradox: One Problem in Data Science
This paper discusses the conditions under which Simpson’s paradox does not occur, for various cases. These conditions are first derived from a descriptive point of view and then under assumed prior probability distributions of the parameters. The robustness of the results with respect to the prior probability distributions is also discussed. In practice, the result is expressed through the magnitude of the odds ratio (or relative risk): Simpson’s paradox does not occur if the odds ratio is greater than, or less than, a certain value that depends on the case.
Keywords: Simpson’s paradox; conditions of non-paradox; descriptive approach; prior probability distribution of a parameter; robustness of solution; Monte Carlo solution
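To make the odds-ratio framing concrete, the following is a minimal sketch of how one might check descriptively whether a set of stratified 2x2 tables shows Simpson’s paradox. It is not the paper’s method: the function names `odds_ratio` and `simpson_reversal`, the table layout, and the counts are illustrative assumptions, and the check only tests for a direction reversal between the stratum odds ratios and the pooled odds ratio. It does not compute the threshold values the abstract refers to.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio of a 2x2 table laid out as (assumed layout):
    a = exposed with event,   b = exposed without event,
    c = unexposed with event, d = unexposed without event.
    Assumes b and c are non-zero."""
    return (a * d) / (b * c)


def simpson_reversal(strata):
    """Return (stratum odds ratios, pooled odds ratio, reversal flag).

    `strata` is a list of (a, b, c, d) tuples, one per stratum.
    A reversal is flagged when every stratum odds ratio lies on one
    side of 1 and the odds ratio of the collapsed table lies on the other.
    """
    stratum_ors = [odds_ratio(*s) for s in strata]
    # Collapse over the stratifying variable by summing cell counts.
    pooled = tuple(sum(cells) for cells in zip(*strata))
    pooled_or = odds_ratio(*pooled)
    all_above = all(o > 1 for o in stratum_ors)
    all_below = all(o < 1 for o in stratum_ors)
    reversal = (all_above and pooled_or < 1) or (all_below and pooled_or > 1)
    return stratum_ors, pooled_or, reversal


# Illustrative counts only (not taken from the paper).
paradox_strata = [(81, 6, 234, 36), (192, 71, 55, 25)]
stratum_ors, pooled_or, reversal = simpson_reversal(paradox_strata)

# A second illustrative case in which collapsing preserves the direction.
stable_strata = [(10, 10, 5, 15), (20, 20, 10, 30)]
stable_ors, stable_pooled_or, stable_reversal = simpson_reversal(stable_strata)
```

In the first set of counts both stratum odds ratios exceed 1 while the pooled odds ratio falls below 1, so the reversal flag is set. In the second set the stratum and pooled odds ratios agree in direction, so it is not.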