How Good were those Probability Predictions? The Expected Recommendation Loss (ERL) Scoring Rule

Chapter in Maximum Entropy and Bayesian Methods.

Part of the book series: Fundamental Theories of Physics (FTPH, volume 62)

Abstract

We present a new way to understand and characterize the choice of scoring rule (probability loss function) used to evaluate a supplier of probabilistic predictions after the outcomes (true classes) are known. The ultimate value of a prediction (estimate) lies in the actual utility (loss reduction) accruing to one who uses it to make some decision. Often we cannot specify with certainty that the prediction will be used in one particular decision problem, characterized by a particular loss matrix (indexed by outcome and decision) and hence by a particular decision threshold. Instead, we consider the more general case of a distribution over such matrices. The proposed scoring rule is the expectation, with respect to this distribution, of the loss actually incurred when following the decision recommendation, i.e., the decision that would be optimal if the predicted probabilities were correct. The logarithmic and quadratic scoring rules arise from specific choices of this distribution, and even common single-threshold measures, such as the ordinary misclassification score, are obtained as degenerate special cases.
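The construction in the abstract can be illustrated numerically. The sketch below assumes a binary outcome and the common cost-threshold normalization of a 2x2 loss matrix (a false positive costs c, a false negative costs 1 - c, so the optimal decision rule is "act positive iff p >= c"); the chapter's exact parameterization may differ. With a uniform distribution over thresholds, the resulting ERL matches half the quadratic (Brier) score, consistent with the abstract's claim that the quadratic rule arises from a specific threshold distribution.

```python
import random

def recommendation_loss(p, y, c):
    """Loss incurred by following the recommendation implied by predicted
    probability p, for true outcome y in {0, 1}, in a decision problem whose
    normalized loss matrix reduces to the cost threshold c: the recommendation
    is 'act positive' iff p >= c. (Assumed normalization: false positive costs
    c, false negative costs 1 - c; correct recommendations cost nothing.)"""
    act_positive = p >= c
    if act_positive and y == 0:
        return c          # false positive
    if not act_positive and y == 1:
        return 1.0 - c    # false negative
    return 0.0

def erl(p, y, threshold_samples):
    """Expected Recommendation Loss: the average recommendation loss over
    thresholds drawn from the assumed distribution over loss matrices."""
    return sum(recommendation_loss(p, y, c) for c in threshold_samples) / len(threshold_samples)

# Uniform distribution over thresholds recovers (half) the quadratic score:
# ERL(p, y) ~= (y - p)**2 / 2.
random.seed(0)
cs = [random.random() for _ in range(200_000)]
p = 0.7
print(erl(p, 1, cs), (1 - p) ** 2 / 2)   # both close to 0.045
print(erl(p, 0, cs), p ** 2 / 2)         # both close to 0.245
```

A degenerate distribution concentrated on a single threshold (e.g. c = 0.5) reduces `erl` to a cost-weighted misclassification score, matching the abstract's remark about single-threshold measures as special cases.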




Copyright information

© 1996 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Rosen, D.B. (1996). How Good were those Probability Predictions? The Expected Recommendation Loss (ERL) Scoring Rule. In: Heidbreder, G.R. (eds) Maximum Entropy and Bayesian Methods. Fundamental Theories of Physics, vol 62. Springer, Dordrecht. https://doi.org/10.1007/978-94-015-8729-7_33

  • DOI: https://doi.org/10.1007/978-94-015-8729-7_33

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-90-481-4407-5

  • Online ISBN: 978-94-015-8729-7
