Quality of Life Research, Volume 23, Issue 1, pp 5–7

Significance, truth and proof of p values: reminders about common misconceptions regarding null hypothesis significance testing

  • Mathilde G. E. Verdam
  • Frans J. Oort
  • Mirjam A. G. Sprangers

Null hypothesis significance testing has successfully reduced the complexity of scientific inference to a dichotomous decision (i.e., ‘reject’ versus ‘not reject’). As a consequence, p values and their associated statistical significance play an important role in the social and medical sciences. But do we truly understand what statistical significance testing and p values entail? Judging by the vast literature on controversies regarding their application and interpretation, this seems questionable. It has even been argued that significance testing should be abandoned altogether [2]. We seek to extend Fayers’ [3] paper on statistically significant correlations and to clarify some of the controversies regarding statistical significance testing by explaining that (1) the p value is not the probability of the null hypothesis; (2) rejecting the null hypothesis does not prove that the alternative hypothesis is true; (3) not rejecting the null hypothesis does not prove that the alternative...
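Point (1) can be illustrated with a small simulation. The following is a minimal sketch, not taken from the paper: it assumes a two-sided z test with known standard deviation 1, and the settings (sample size of 20, true effect of 0.5, and a null hypothesis that is true in 90% of studies) are hypothetical values chosen only for illustration.

```python
import math
import random


def two_sided_p(z):
    """Two-sided p value of a z statistic under a standard normal null."""
    return math.erfc(abs(z) / math.sqrt(2))


def simulate(n_studies=20000, n=20, effect=0.5, prior_null=0.9,
             alpha=0.05, seed=1):
    """Simulate many studies; all parameter values are illustrative assumptions."""
    rng = random.Random(seed)
    n_null = 0      # studies in which the null hypothesis is actually true
    sig_null = 0    # significant results where the null is true
    sig_alt = 0     # significant results where the alternative is true
    for _ in range(n_studies):
        null_true = rng.random() < prior_null
        mu = 0.0 if null_true else effect
        mean = sum(rng.gauss(mu, 1.0) for _ in range(n)) / n
        p = two_sided_p(mean * math.sqrt(n))  # z = mean / (1 / sqrt(n))
        if null_true:
            n_null += 1
        if p < alpha:
            if null_true:
                sig_null += 1
            else:
                sig_alt += 1
    # P(p < alpha | null true): what the significance level controls
    false_positive_rate = sig_null / n_null
    # P(null true | p < alpha): what readers often think p tells them
    share_null_among_sig = sig_null / (sig_null + sig_alt)
    return false_positive_rate, share_null_among_sig


fpr, share = simulate()
print(f"P(significant | null true)  ~ {fpr:.3f}")
print(f"P(null true | significant)  ~ {share:.3f}")
```

Under these assumed settings, the first quantity stays close to the significance level of 0.05, whereas the share of true null hypotheses among significant results is several times larger. The second quantity depends on the power of the test and on how often the null hypothesis is true, neither of which a p value conveys.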


Keywords (machine-generated, not supplied by the authors): Null Hypothesis Significance Testing; Alternative Hypothesis; Minimal Important Difference; Effect Size Estimate


References

  1. Bakan, D. (1966). The test of significance in psychological research. Psychological Bulletin, 66, 1–29.
  2. Carver, R. (1978). The case against statistical significance testing. Harvard Educational Review, 48, 378–399.
  3. Fayers, P. M. (2008). The scales were highly correlated: p = 0.001. Quality of Life Research, 17, 651–652.
  4. Nickerson, R. S. (2000). Null hypothesis significance testing: A review of an old and continuing controversy. Psychological Methods, 5, 241–301.
  5. Revicki, D., Hays, R. D., Cella, D., & Sloan, J. (2008). Recommended methods for determining responsiveness and minimally important differences for patient-reported outcomes. Journal of Clinical Epidemiology, 61, 102–109.
  6. Wilkinson, L., & APA Task Force on Statistical Inference. (1999). Statistical methods in psychology journals. American Psychologist, 54, 594–604.
  7. Norman, G. R., Sloan, J. A., & Wyrwich, K. W. (2003). Interpretation of changes in health-related quality of life: The remarkable universality of half a standard deviation. Medical Care, 41, 582–592.
  8. Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum.
  9. Glaser, D. N. (1999). The controversy of significance testing: Misconceptions and alternatives. American Journal of Critical Care, 8, 291–296.
  10. Cohen, J. (1994). The earth is round (p < .05). American Psychologist, 49, 997–1003.

Copyright information

© Springer Science+Business Media Dordrecht 2013

Authors and Affiliations

  • Mathilde G. E. Verdam (1, 2)
  • Frans J. Oort (1, 2)
  • Mirjam A. G. Sprangers (2)

  1. Department of Child Development and Education, University of Amsterdam, Amsterdam, The Netherlands
  2. Department of Medical Psychology, Academic Medical Centre, University of Amsterdam, Amsterdam, The Netherlands
