Abstract
Data not suitable for classic parametric statistical analyses arise frequently in human–computer interaction studies. Various nonparametric statistical procedures are appropriate and advantageous when used properly. This chapter organizes and illustrates multiple nonparametric procedures, contrasting them with their parametric counterparts. Guidance is given for when to use nonparametric analyses and how to interpret and report their results.
Notes
- 1.
The Mann-Whitney U test has multiple and sometimes confusing names. It is also known as the Wilcoxon-Mann-Whitney test, the Mann-Whitney-Wilcoxon test, and the Wilcoxon rank-sum test. None of these should be confused with the Wilcoxon signed-rank test, which is for one-factor two-level within-subjects designs.
- 2.
Holm’s sequential Bonferroni procedure for three pairwise comparisons uses a significance threshold of \(\alpha =0.05/3\) for the lowest p-value, \(\alpha =0.05/2\) for the second lowest p-value, and \(\alpha =0.05/1\) for the highest p-value. Should a p-value compared in that ascending order fail to be statistically significant, the procedure halts and any subsequent comparisons are regarded as statistically nonsignificant.
- 3.
Rather than using traditional repeated measures ANOVAs, ARTool uses mixed-effects analyses of variance, explained below in the section on Generalized Linear Mixed Models.
- 4.
General Linear Models are often called “linear models” and may be abbreviated “LM.” These should not be confused with Generalized Linear Models, which may be abbreviated “GLM.” However, some texts use “GLM” for linear models and “GZLM” for generalized models. Readers should take care when encountering this family of abbreviations.
- 5.
While not covered in this chapter, LMs and GLMs also offer the ability to use continuous independent variables, not just categorical independent variables (see Chap. 11).
- 6.
Multinomial logistic regression—when used with dichotomous responses such as Yes/No, True/False, Success/Fail, Agree/Disagree, or 1/0—is called “binomial regression.” The GLM for binomial regression uses a “binomial” distribution and “logit” link function. It can be conducted using the glm function in much the same way as Poisson regression explained below, except with the parameter family=binomial.
- 7.
Given data with a large number of zeroes, it is prudent to consider an extension to Poisson regression called “zero-inflated” Poisson regression. This model incorporates binomial regression to predict the probability of a zero alongside Poisson regression to model counts. See the zeroinfl function in the pscl package.
- 8.
Although the canonical link function for the Gamma distribution is actually the “inverse” function, the “log” function is often used because the inverse function can be difficult to estimate due to discontinuity at zero. The two functions provide similar results.
- 9.
This model uses an intercept-only random effect. There are other types of random effects such as slopes-and-intercept random effects that are described in Chap. 11.
- 10.
The ANOVA type indicates how the sums-of-squares are computed. In general, Type III ANOVAs are preferred because they can support conclusions about main effects in the presence of significant interactions. For Type I and Type II ANOVAs, significant main effects cannot safely be interpreted in the presence of significant interactions.
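The Holm procedure described in note 2 can be sketched in a few lines of Python. This is a minimal illustration rather than the chapter's own code: the function name `holm_significant` and the example p-values are hypothetical, and a p-value is treated as significant when it is less than or equal to its threshold.

```python
def holm_significant(p_values, alpha=0.05):
    """Apply Holm's sequential Bonferroni procedure.

    Returns a list of booleans in the same order as p_values:
    True where the comparison is declared statistically significant.
    """
    m = len(p_values)
    # Indices of the p-values, sorted from lowest to highest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    significant = [False] * m
    for rank, i in enumerate(order):
        # Thresholds are alpha/m, alpha/(m-1), ..., alpha/1.
        threshold = alpha / (m - rank)
        if p_values[i] > threshold:
            # First failure: halt; all remaining comparisons stay nonsignificant.
            break
        significant[i] = True
    return significant


# Hypothetical p-values from three pairwise comparisons.
example = holm_significant([0.030, 0.012, 0.049])
```

With these hypothetical values, 0.012 passes its threshold of 0.05/3, but 0.030 exceeds 0.05/2, so the procedure halts and 0.049 is also regarded as nonsignificant even though it is below 0.05.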
Copyright information
© 2016 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Wobbrock, J.O., Kay, M. (2016). Nonparametric Statistics in Human–Computer Interaction. In: Robertson, J., Kaptein, M. (eds) Modern Statistical Methods for HCI. Human–Computer Interaction Series. Springer, Cham. https://doi.org/10.1007/978-3-319-26633-6_7
DOI: https://doi.org/10.1007/978-3-319-26633-6_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-26631-2
Online ISBN: 978-3-319-26633-6
eBook Packages: Computer Science (R0)