Tests of Statistical Significance Made Sound

Method Matters in Psychology

Part of the book series: Studies in Applied Philosophy, Epistemology and Rational Ethics (SAPERE, volume 45)

Abstract

This chapter considers the nature and place of tests of statistical significance (ToSS) in science, with particular reference to psychology. Despite the enormous amount of attention given to this topic, psychology’s understanding of ToSS remains deficient. The major problem stems from a widespread and uncritical acceptance of null hypothesis significance testing, which is an indefensible amalgam of ideas adapted from Fisher’s thinking on the subject and from Neyman and Pearson’s alternative account. To correct for the deficiencies of the hybrid, it is suggested that psychology avail itself of two important and more recent viewpoints on ToSS, namely the neo-Fisherian and the error-statistical perspectives. It is suggested that these more recent outlooks on ToSS are a definite improvement on standard null hypothesis significance testing. It is concluded that ToSS can play a useful, if limited, role in psychological research.

Author information

Correspondence to Brian D. Haig.

Copyright information

© 2018 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Haig, B.D. (2018). Tests of Statistical Significance Made Sound. In: Method Matters in Psychology. Studies in Applied Philosophy, Epistemology and Rational Ethics, vol 45. Springer, Cham. https://doi.org/10.1007/978-3-030-01051-5_9
