# Normal and Non-normal Data Simulations for the Evaluation of Two-Sample Location Tests

## Abstract

Two-sample location tests are a family of statistical tests that compare two independent distributions via measures of central tendency, most commonly means or medians. The *t*-test is the most widely recognized parametric option for two-sample mean comparisons. The pooled *t*-test assumes that the two population variances are equal. When the two population variances are unequal, Welch’s *t*-test is more appropriate. Both *t*-tests assume normally distributed data. If the normality assumption is violated, a non-parametric alternative such as the Wilcoxon rank-sum test has the potential to maintain adequate type I error and appreciable power. Although sometimes considered controversial, pretesting for normality, followed by the *F*-test for equality of variances, may be applied before selecting a two-sample location test. This approach yields multi-stage tests as a further alternative for two-sample location comparisons: a normality test comes first, followed by either Welch’s *t*-test or the Wilcoxon rank-sum test. Less commonly used alternatives include permutation tests, which evaluate statistical significance based on empirical distributions of test statistics. Overall, a variety of statistical tests are available for two-sample location comparisons. Which tests perform best in terms of type I error and power depends on the data distribution, the population variances, and the sample size. One way to evaluate these tests is to simulate data that mimic what might be encountered in practice. In this chapter, Monte Carlo techniques are used to simulate normal and non-normal data for the evaluation of two-sample location tests.
