Abstract
This chapter considers the nature and place of tests of statistical significance (ToSS) in science, with particular reference to psychology. Despite the enormous amount of attention given to this topic, psychology’s understanding of ToSS remains deficient. The major problem stems from a widespread and uncritical acceptance of null hypothesis significance testing, an indefensible amalgam of ideas adapted from Fisher’s thinking on the subject and from Neyman and Pearson’s alternative account. To correct the deficiencies of this hybrid, the chapter suggests that psychology avail itself of two important and more recent viewpoints on ToSS, namely the neo-Fisherian and the error-statistical perspectives, both of which are a definite improvement on standard null hypothesis significance testing. It concludes that ToSS can play a useful, if limited, role in psychological research.
Copyright information
© 2018 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Haig, B.D. (2018). Tests of Statistical Significance Made Sound. In: Method Matters in Psychology. Studies in Applied Philosophy, Epistemology and Rational Ethics, vol 45. Springer, Cham. https://doi.org/10.1007/978-3-030-01051-5_9
DOI: https://doi.org/10.1007/978-3-030-01051-5_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01050-8
Online ISBN: 978-3-030-01051-5
eBook Packages: Behavioral Science and Psychology (R0)