How Does the Brain Do Plausible Reasoning?
We start from the observation that the human brain does plausible reasoning in a fairly definite way. It is shown that there is only a single set of rules for doing this which is consistent and in qualitative correspondence with common sense. These rules are simply the equations of probability theory, and they can be deduced without any reference to frequencies.
We conclude that the method of maximum-entropy inference and the use of Bayes’ theorem are statistical techniques fully as valid as any based on the frequency interpretation of probability. Their introduction enables us to broaden the scope of statistical inference so that it includes both communication theory and thermodynamics as special cases.
The program of statistical inference is thus formulated in a new way. We regard the general problem of statistical inference as that of devising new consistent principles by which we can translate “raw” information into numerical values of probabilities, so that the Laplace–Bayes model is enabled to operate on more and more different kinds of information. That there must exist many such principles, as yet undiscovered, is shown by the simple fact that our brains do this every day.
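As an illustration of this program, the maximum-entropy principle translates one simple kind of raw information, a known mean value, into a definite probability assignment. The sketch below (a hypothetical example, not from the paper: the function name and the choice of a six-sided die constrained to mean 4.5 are the writer's own) finds the distribution p_i proportional to exp(-λi) whose mean matches the constraint, using bisection on λ:

```python
import math

def maxent_dice(target_mean, faces=6, tol=1e-10):
    """Maximum-entropy distribution over die faces 1..faces with a
    fixed mean. The solution has p_i proportional to exp(-lam * i);
    we solve for lam by bisection (mean is decreasing in lam)."""
    def mean_for(lam):
        w = [math.exp(-lam * i) for i in range(1, faces + 1)]
        z = sum(w)
        return sum(i * wi for i, wi in zip(range(1, faces + 1), w)) / z

    lo, hi = -50.0, 50.0          # brackets any mean strictly inside (1, faces)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mean_for(mid) > target_mean:
            lo = mid              # mean too large: increase lam
        else:
            hi = mid
    lam = (lo + hi) / 2
    w = [math.exp(-lam * i) for i in range(1, faces + 1)]
    z = sum(w)
    return [wi / z for wi in w]

probs = maxent_dice(4.5)
```

With the constraint mean = 4.5 (above the uniform value 3.5), the resulting assignment tilts toward the higher faces, while remaining as noncommittal as the given information allows.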
Keywords: Common Sense, Communication Theory, Plausible Reasoning, Probability Assignment, Frequency Theory
- 1. C. E. Shannon, “A Mathematical Theory of Communication,” Bell Syst. Tech. Jour., Vol. 27, pp. 379–423, 623–655; July, October, 1948. Also in C. E. Shannon and W. Weaver, “The Mathematical Theory of Communication,” University of Illinois Press, Urbana, 1949.
- 2. N. H. Abel, Crelle’s Jour., Bd. 1 (1826).
- 4. “La théorie des probabilités n’est que le bon sens réduit au calcul.” (“Probability theory is nothing but common sense reduced to calculation.”) This occurs in the Introduction to P. S. Laplace, “Exposition de la théorie des chances et des probabilités,” Paris, 1843. The same statement, with slightly different wording, is found in the Truscott–Emory translation of P. S. Laplace, “A Philosophical Essay on Probabilities,” Dover Publications, N.Y. (1951), p. 196.
- 5. G. Polya, “Mathematics and Plausible Reasoning,” Volumes I and II, Princeton University Press, 1954.
- 7. This notation is perhaps confusing. It can be made clearer if we suppose that the symbol for a plausibility is not (A∣B), but just A∣B, the parentheses being unnecessary. However, when one writes down more involved equations, the absence of parentheses can cause even greater confusion (Ref. 3). The notation adopted here, while not entirely consistent, appears to the writer as the lesser of two evils.
- 8. H. Jeffreys, “Theory of Probability,” Oxford University Press, 1939.
- 9. This is not a direct quotation from any particular author, but a statement of what is implied by many authors. For example, see Ref. 10, pp. 150–151, or Ref. 12, pp. 4–6.
- 10. H. Cramér, “Mathematical Methods of Statistics,” Princeton University Press, 1946.
- 11. Reference 5, Vol. II, p. 136. For other examples, see Ref. 8, pp. 107–110, and Ref. 12, p. 64.
- 12. W. Feller, “An Introduction to Probability Theory and Its Applications,” John Wiley and Sons, Inc., N.Y., 1950. Any reader familiar with this book will see at once that the present paper is largely a reaction against, and a search for an alternative to, the philosophical views expressed therein. I believe this is necessary if probability theory is to meet all the needs of science and engineering. But no one can challenge Feller’s beautiful mathematical results, the validity of which does not depend on how we choose to interpret them. They are as useful in Laplace’s theory as in the frequency theory.
- 13. This is far from being a precise statement. The derivation of Eq. (6–13) shows in more detail what is required for the law of succession to apply.
- 14. However, it served Laplace very well indeed. The following procedure led him to some of the most important discoveries in celestial mechanics. Noting a discrepancy between observation and existing theory, he would break down the situation into alternatives which seemed intuitively “equally possible.” He would then compare the probability that a discrepancy of this size is due to a systematic effect with the probability that it is due to errors of observation. Whenever the ratio was sufficiently high, he would decide that this is a problem worth working on, and attack it. He was, in fact, using Wald’s decision theory, in exactly the way developed recently by Middleton, van Meter, and others for the detection of signals in noise.
- 15. Ref. 10, pp. 507–524.
- 16. E. T. Jaynes, “Information Theory and Statistical Mechanics,” Physical Review, Vol. 106, pp. 620–630; May 15, 1957. At the time of writing this, I was under the impression that the frequency theory and Laplace’s theory are parallel, co-equal theories using the same mathematical rules. However, the arguments of the present paper show that the frequency theory is only a special case of Laplace’s theory.
- 17. E. T. Jaynes, “Information Theory and Statistical Mechanics II,” submitted to the Physical Review.
- 18. E. T. Jaynes, “Poincaré Recurrence Times and Statistical Mechanics,” submitted to the Physical Review.
- 19. This can be stated in a more precise epsilon-delta language, but the reader will anticipate that the conclusions are largely independent of what we mean by “reasonably probable,” for the same reason as in Shannon’s theorem 4.
- 20. (Df∣Ap) is a probability density, (Df∣Ap) df being a probability. Since, however, the differentials cancel out of equations, and the distinction is already determined by whether the variable is continuous or discrete, there is no need to invent a new notation. On the other hand, it is essential in this theory that we do distinguish in notation between a probability and a frequency.