A Conceptual Reply to Reverend Bayes: The Frequentist Approach

Vallverdú, Jordi

doi:10.1007/978-3-662-48638-2_4

Part of the book series: SpringerBriefs in Statistics (BRIEFSSTATIST)

Abstract

The response to the subjective probabilities of the Bayesian approach was frequentism, that is, the analysis of long-run frequencies of an event, from which statistical data can be extracted. Frequentism became the dominant view in scientific practice during most of the twentieth century. It was espoused by several authors, such as Pearson, Fisher, Gosset, and Neyman and Pearson, who did not all agree on the best way to carry out this statistical approach. The main ideas and internal debates are analyzed here.

Notes

1.
From Smith et al. (1890): “CAPITA AUT NAVIA head or tail, the name of a game at ‘pitch and toss,’ derived from the fact that early as had on one side a double-faced Janus, on the other the prow of a ship. See cut of as on p. 202. (Macr. 1.7, 22; Fest. s. v. Navia, p. 169 M.)”.
2.
Admittedly, one could ask whether personal beliefs really cannot affect a frequentist experiment: they enter into the selection of the variables that may or may not affect the experiment and even into the meaning of “a large number of trials,” something completely subjective that can vary from discipline to discipline or even from laboratory to laboratory. For a fine view opposed to my ideas on this topic, read Gillies (200): 152–153. In any case, the selection of the basic range of trials is absolutely subjective.
3.
Of great philosophical interest is his The Grammar of Science (1892), a book about which the young Albert Einstein was enthusiastic and which also drew criticism from Lenin in the debate between materialism and idealism; see Lenin, V.I. (1909) Materialism and Empirio-Criticism. Critical Comments on a Reactionary Philosophy, Ch. 5, Sect. 2. Lenin described him as a “Machian.” Read this section at: http://www.marxists.org/archive/lenin/works/1908/mec/five2.htm, accessed on August 13, 2013. At the same time, we need to explain that Pearson wrote on epistemological issues: for example, he considered that knowledge came from sensations and that probability tried to find invariability among these groups of sensations (shared with other individuals as a “sameness experience”). Lenin quoted him on this topic in V.I. Lenin (1908) Materialism and Empirio-Criticism, Critical Comments on a Reactionary Philosophy, Chapter One: The Theory of Knowledge of Empirio-Criticism and of Dialectical Materialism. (1) Sensations and Complexes of Sensations. There is a strong connection between his notions of science and the future of human societies (Norton 1978).
4.
Pearson is also the longest “sleeping beauty” in the history of science: he wrote “On lines and planes of closest fit to systems of points in space” in 1901, but his ideas gained approval only in 2002. This indicates that changes in society and advances in understanding can breathe new life into sometimes long-forgotten scientific papers. See the latest research on this topic in Ke et al. (2015).
5.
According to Louçã (2008: 3), Fisher’s model of Mendelian populations was a metaphor drawn from the molecular models of statistical mechanics applied to gases.
6.
R.C. Punnett, as well as Leonard and Horace Darwin, both sons of Charles Darwin, were also members of this society. The friendship between Leonard Darwin and R.A. Fisher was very close, built on their shared interest in eugenics and evolutionary ideas. Moreover, Leonard supported Fisher at the beginning of his career. Pearson was also close to these circles, in his case as a friend of Ida Darwin, wife of Horace Darwin, as some correspondence shows (http://www.eugenicsarchive.org/html/eugenics/static/images/2140.html, accessed on August 12th, 2014). Again, eugenics was the common point.
7.
Later, in 1952, Kruskal and Wallis published a paper in which they provided a nonparametric equivalent of the ANOVA. In 1951, John Tukey had designed an a posteriori multiple comparison tool for the ANOVA: Tukey’s HSD (honestly significant difference) test. Tukey’s multiple comparison test is one of several tests that can be used to determine which means among a set of means differ from the rest. There are other multiple comparison tests, including Scheffé’s test and Dunnett’s test.
8.
Curiously, in the early 1930s, most statisticians regarded fiducial probability and Neyman’s confidence intervals as synonymous (Louçã 2008: 22).
9.
According to Gillies (personal communication), it is worth noting that Popper was also criticizing the notion of inductive reasoning in the 1930s. Generally, classical (or frequentist) statistics was very much in agreement with Popper’s anti-inductivist methodology of conjectures and refutations, except for the fiducial argument, which is definitely inductivist in character.
10.
When Neyman moved to the University of California at Berkeley, he transformed that place into an anti-Bayesian powerhouse.
11.
The second half of K. Pearson’s divided position was given to R.A. Fisher, the next author to be covered in this chapter. Despite the objections of Karl Pearson, his laboratory was divided into separate departments of statistics and eugenics. His son became head of the new Department of Statistics, while R.A. Fisher was elected Galton Professor of National Eugenics (Inman 1994: 4).
12.
Curiously, George Box, a former student of Pearson, became a Bayesian and even married one of Fisher’s daughters. Some years later he divorced her, because his wife had inherited a temper much like her father’s.
13.
There is also a Type III error: getting the right answer to the wrong question. This is sometimes called a Type 0 error. It arises in a two-sided test when one side is erroneously favoured although the true effect actually lies on the other side. It is not a false positive but a crossed causal relationship. See Schwartz and Carpenter (1999); for a more recent analysis, see Heinz and Waldhoer (2012).
14.
This obvious fact should instill modesty in statisticians as well as a careful epistemological attitude (Boffetta et al. 2008; Blair et al. 2009).
15.
As Meehl (1990, p. 110) explained, quoting the words of the American philosopher Morris Raphael Cohen: “All logic texts are divided into two parts. In the first part, on deductive logic, the fallacies are explained; in the second part, on inductive logic, they are committed.” Induction is one of the oldest and most contentious problems in the history of philosophy.
16.
See the acid analysis of Nuzzo (2014). As she points out (p. 151): “Neyman called some of Fisher’s work mathematically ‘worse than useless’; Fisher called Neyman’s approach ‘childish’ and ‘horrifying (for) intellectual freedom in the west.’” As Bertsch (2011: 46) compiles from his contemporaries, Fisher was aggressive, impolite, and fiery-tempered. About p-values, Nuzzo is even sardonic (p. 150): “P values have always had critics. In their almost nine decades of existence, they have been likened to mosquitoes (annoying and impossible to swat away), the emperor’s new clothes (fraught with obvious problems that everyone ignores) and the tool of a ‘sterile intellectual rake’ who ravishes science but leaves it with no progeny. One researcher suggested rechristening the methodology ‘statistical hypothesis inference testing’, presumably for the acronym it would yield.” And she meant SHIT.
17.
About this debate, read the precise paper of Gillies (1971).
18.
Trafimow and Marks (2015).
19.
Head talks about “p-hacking.” Read Head et al. (2015).
20.
Savage was a brilliant and ironic researcher, and at the same time, he showed a deep interest in the philosophical aspects of statistical concepts. For example, in “The Foundations of Statistics Reconsidered” he wrote: “Fisher’s school, with its emphasis on fiducial probability, a bold attempt to make the Bayesian omelet without breaking the Bayesian eggs, may be regarded as an exception to the rule that frequentists leave great latitude for subjective choice in statistical analysis” (1961, p. 578), published in the Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability. He considered that “Personal probability at present provides an excellent base of operations from which to criticize and advance statistical theory.” Bayesianism had in him a strong leader and deep thinker.
21.
Curiously, Raiffa published most of his ideas as books, with only a few minor papers. And, again, his beginnings were related to the military war effort. Read the wonderful paper about his life (in the form of an interview) by Fienberg (2008).
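
Note 7 names the Kruskal and Wallis rank-based equivalent of the ANOVA without showing how it works. As a purely illustrative aid, the following is a minimal sketch of the H statistic in plain Python. The function names `average_ranks` and `kruskal_wallis_h` are hypothetical and do not come from the chapter, the toy data are invented for the illustration, and the usual correction for tied values is deliberately left out.

```python
from itertools import chain


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic (illustrative; no tie correction)."""
    if len(groups) < 2 or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups")
    pooled = list(chain.from_iterable(groups))
    n_total = len(pooled)
    ranks = average_ranks(pooled)  # rank all observations together
    weighted = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        weighted += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)


# Invented toy data: three fully separated groups give H of about 7.2.
h = kruskal_wallis_h([1, 2, 3], [4, 5, 6], [7, 8, 9])
print(round(h, 3))
```

Under the null hypothesis that all groups come from the same distribution, H is approximately chi-squared distributed with k − 1 degrees of freedom for k groups, although that approximation is rough for samples as small as these.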

References

Blair, A., et al. (2009). Epidemiology, public health and the rhetoric of false positives. Environmental Health Perspectives, 117(12), 1809–1813.
Boffetta, P., et al. (2008). False positive results in cancer epidemiology: A plea for epistemological modesty. Journal of the National Cancer Institute, 100, 988–995.
Cousins, R. D. (1995). Why isn’t every physicist a Bayesian? American Journal of Physics, 63(5), 398–410.
Fienberg, S. E. (2008). The early statistical years: 1947–1967 a conversation with Howard Raiffa. Statistical Science, 23(1), 136–149.
Fisher, R. A. (1922). On the mathematical foundations of theoretical statistics. Philosophical Transactions of the Royal Society of London. Series A, 222, 309–368.
Fisher, R. A. (1950). Gene frequencies in a cline determined by selection and diffusion. Biometrics, 6(4), 353–361.
Freedman, D. A., Pisani, R., & Purves, R. A. (1979). Statistics. NY: W.W. Norton.
Gillies, D. A. (1971). A falsifying rule for probability statements. British Journal for the Philosophy of Science, 22, 231–261.
Grattan-Guinness, I. (Ed.). (1994). Companion encyclopaedia of the history and philosophy of the mathematical sciences. London: Routledge.
Head, M. L., Holman, L., Lanfear, R., Kahn, A. T., & Jennions, M. D. (2015). The extent and consequences of P-Hacking in science. PLoS Biology, 13(3), e1002106.
Heinz, H., & Waldhoer, T. (2012). Relevance of the type III error in epidemiological maps. International Journal of Health Geographics, 11(34), 1–9.
Howie, D. (2002). Interpreting probability: Controversies and developments in the early twentieth century. Oxford: OUP.
Inman, H. F. (1994). Karl Pearson and R.A. Fisher on statistical tests: A 1935 exchange from Nature. The American Statistician, 48(1), 2–11.
Ke, Q., Ferrara, E., Radicchi, F., & Flammini, A. (2015). Defining and identifying Sleeping Beauties in science. Proceedings of the National Academy of Sciences USA. http://dx.doi.org/10.1073/pnas.1424329112. Accessed on May 26, 2015.
Kruskal, W. H., & Wallis, W. A. (1952). Use of ranks in one-criterion variance analysis. Journal of the American Statistical Association, 47(260), 583–621.
Lakatos, I. (1978). The methodology of scientific research programmes: Philosophical papers (Vol. 1). Cambridge: Cambridge University Press.
Lehmann, E. L. (1993). The Fisher, Neyman-Pearson theories of testing hypotheses: One theory or two? Journal of the American Statistical Association, 88(424), 1242–1249.
Lehmann, E. L. (1994). Jerzy Neyman 1894–1981. A biographical memoir. Washington D.C.: National Academy of Sciences.
Lenhard, J. (2006). Models and statistical inference: The controversy between Fisher and Neyman–Pearson. British Journal for the Philosophy of Science, 57, 69–91.
Louçã, F. (2008). The widest cleft in statistics—How and why Fisher opposed Neyman and Pearson. Working Papers WP 02/2008/DE/UECE. Lisbon: ISEG.
McGrayne, S. B. (2011). The theory that would not die. How Bayes’ rule cracked the enigma code, hunted down Russian submarines, and emerged triumphant from two centuries of controversy. USA: Yale University Press.
Meehl, P. E. (1990). Appraising and amending theories: The strategy of Lakatosian defense and two principles that warrant using it. Psychological Inquiry, 1, 108–141.
Neyman, J., & Pearson, E. S. (1928). On the use and interpretation of certain test criteria for purposes of statistical inference: Part I. Biometrika, 20A(1/2), 175–240.
Neyman, J. (1949). Foundations of the general theory of estimation. Actualités Scientifiques et Industrielles, 1951(1146), 85.
Norton, B. J. (1978). Karl Pearson and statistics: The social origins of scientific innovation. Social Studies of Science, 8(1), 3–34.
Nuzzo, R. (2014). Statistical errors. Nature, 506, 150–152.
Pearson, K. (1900). On the criterion that a given system of deviations from the probable in case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling. Philosophical Magazine Series, 50(302), 157–175.
Plackett, R. L. (1983). Karl Pearson and the chi-squared test. International Statistical Review, 51(1), 59–72.
Popper, K. (1934). Logik der Forschung: Zur Erkenntnistheorie der modernen Naturwissenschaft. Vienna: Springer.
Robinson, D. H., & Wainer, H. (2001). On the past and future of null hypothesis significance testing. Research report RR-01-24. Princeton: ETS.
Savage, L. J. (1961). The foundations of statistics reconsidered. In Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability (Vol. 1, pp. 575–586). Berkeley, California: University of California Press.
Schwartz, S., & Carpenter, K. M. (1999). The right answer for the wrong question: Consequences of type III error for public health research. American Journal of Public Health, 89(8), 1175–1180.
Smith, W., Wayte, W., & Marindin, G. E. (1890). A dictionary of Greek and Roman antiquities. London: John Murray.
Stigler, S. M. (2012). Karl Pearson and the rule of three. Biometrika, 99(1), 1–14.
Student. (1908). The probable error of a mean. Biometrika, 6, 1–25.
Trafimow, D., & Marks, M. (2015). Editorial. Basic and Applied Social Psychology, 37, 1–2.
Wagenmakers, E. J. (2007). A practical solution to the pervasive problems of p-values. Psychonomic Bulletin & Review, 14(5), 779–804.
Wald, A. (1939). Contributions to the Theory of Statistical Estimation and Testing Hypotheses. Annals of Mathematical Statistics, 10(4), 299–326.

Author information

Authors and Affiliations

Faculty of Philosophy and Arts, Universitat Autònoma de Barcelona, Bellaterra (Cerdanyola del Vallès), Spain
Jordi Vallverdú

Corresponding author

Correspondence to Jordi Vallverdú.

About this chapter

Cite this chapter

Vallverdú, J. (2016). A Conceptual Reply to Reverend Bayes: The Frequentist Approach. In: Bayesians Versus Frequentists. SpringerBriefs in Statistics. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-48638-2_4

DOI: https://doi.org/10.1007/978-3-662-48638-2_4
Published: 07 November 2015
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-48636-8
Online ISBN: 978-3-662-48638-2
eBook Packages: Mathematics and Statistics (R0)
