The p-value Case, a Review of the Debate: Issues and Plausible Remedies

  • Conference paper
Studies in Theoretical and Applied Statistics (SIS 2016)

Part of the book series: Springer Proceedings in Mathematics & Statistics ((PROMS,volume 227))


Abstract

We review the recent debate on the lack of reliability of scientific results and its connections to the statistical methodologies at the core of the discovery paradigm. Null hypothesis significance testing (NHST), in particular, has often been related to, if not blamed for, the present situation. We argue that only a loose relation exists: although NHST, if properly used, cannot be seen as a cause, some common misuses may mask or even favour bad practices that lead to the lack of reliability. We discuss various proposals that have been put forward to deal with these issues.
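To make the point about misuse concrete, the following minimal sketch simulates one common misuse: measuring several outcomes when no effect exists and reporting only the smallest p-value. This scenario is an illustrative assumption, not an example taken from the paper, and the function names, sample sizes and the 0.05 threshold are all hypothetical choices. Each test on its own is a valid two-sided z-test, yet the chance of declaring at least one "significant" result grows with the number of outcomes tried.

```python
import math
import random


def two_sided_z_pvalue(sample):
    """Two-sided p-value for H0: mean = 0, assuming a known unit variance."""
    n = len(sample)
    z = sum(sample) / math.sqrt(n)  # sample mean divided by its standard error
    return math.erfc(abs(z) / math.sqrt(2))  # P(|Z| > |z|) for standard normal Z


def false_positive_rate(n_outcomes, n_studies=2000, n_obs=20, alpha=0.05, seed=0):
    """Share of simulated null studies whose smallest p-value falls below alpha.

    Every outcome is pure noise (the null hypothesis is true), so any
    "significant" result is a false positive. All defaults are illustrative.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        pvals = [
            two_sided_z_pvalue([rng.gauss(0.0, 1.0) for _ in range(n_obs)])
            for _ in range(n_outcomes)
        ]
        if min(pvals) < alpha:  # selective reporting: keep only the best outcome
            hits += 1
    return hits / n_studies


if __name__ == "__main__":
    for k in (1, 5, 10):
        simulated = false_positive_rate(k)
        theoretical = 1 - (1 - 0.05) ** k  # independent tests at level 0.05
        print(f"outcomes={k:2d}  simulated={simulated:.3f}  theoretical={theoretical:.3f}")
```

With a single pre-specified outcome the simulated rate should stay close to the nominal level, while reporting the best of several independent outcomes should track 1 - (1 - 0.05)^k. The test is not at fault in either case; the selective reporting is.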

Acknowledgements

This work was supported by the University of Trieste within the FRA project “Politiche strutturali e riforme. Analisi degli indicatori e valutazione degli effetti”.

Author information

Corresponding author

Correspondence to Francesco Pauli.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Pauli, F. (2018). The p-value Case, a Review of the Debate: Issues and Plausible Remedies. In: Perna, C., Pratesi, M., Ruiz-Gazen, A. (eds) Studies in Theoretical and Applied Statistics. SIS 2016. Springer Proceedings in Mathematics & Statistics, vol 227. Springer, Cham. https://doi.org/10.1007/978-3-319-73906-9_9
