Law and Human Behavior, Volume 29, Issue 5, pp 615–620

Comparing Effect Sizes in Follow-Up Studies: ROC Area, Cohen's d, and r



To facilitate comparisons across follow-up studies that have used different measures of effect size, we provide a table of effect size equivalencies for the three most common measures: ROC area (AUC), Cohen's d, and r. We outline why AUC is the preferred measure of predictive or diagnostic accuracy in forensic psychology and psychiatry, and we urge researchers and practitioners to use numbers rather than verbal labels to characterize effect sizes.
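The equivalencies described above can be illustrated with a short conversion sketch. The table itself is not reproduced here, so the code below does not claim to be the authors' procedure; it assumes the standard textbook relations for two normally distributed groups with equal variances, namely AUC = Φ(d/√2) and the point-biserial relation r = d/√(d² + 1/(pq)), where p is the base rate of the outcome and q = 1 − p. The function names and the default base rate of 0.5 are illustrative assumptions.

```python
import math
from statistics import NormalDist

# Assumption: both groups are normal with equal variance, so the
# conversions below are the standard textbook relations, not
# necessarily the exact procedure used to build the paper's table.
_STD_NORMAL = NormalDist()


def d_to_auc(d: float) -> float:
    """Cohen's d -> ROC area under the binormal, equal-variance model."""
    return _STD_NORMAL.cdf(d / math.sqrt(2))


def auc_to_d(auc: float) -> float:
    """ROC area -> Cohen's d (inverse of d_to_auc); needs 0 < auc < 1."""
    return math.sqrt(2) * _STD_NORMAL.inv_cdf(auc)


def d_to_r(d: float, base_rate: float = 0.5) -> float:
    """Cohen's d -> point-biserial r for a given outcome base rate."""
    pq = base_rate * (1.0 - base_rate)
    return d / math.sqrt(d * d + 1.0 / pq)


def r_to_d(r: float, base_rate: float = 0.5) -> float:
    """Point-biserial r -> Cohen's d (inverse of d_to_r); needs |r| < 1."""
    pq = base_rate * (1.0 - base_rate)
    return r / math.sqrt(pq * (1.0 - r * r))


if __name__ == "__main__":
    # Print a few equivalent values side by side at a 50% base rate.
    for d in (0.2, 0.5, 0.8):
        print(f"d={d:.2f}  AUC={d_to_auc(d):.3f}  r={d_to_r(d):.3f}")
```

Under these assumptions AUC depends only on d, whereas r also shrinks as the base rate moves away from 0.5, so the same d maps to different values of r in samples with different base rates.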

Key Words

effect size; ROC area; risk assessment; predictive accuracy





Copyright information

© American Psychology-Law Society/Division 41 of the American Psychological Association 2005

Authors and Affiliations

  1. Mental Health Centre, Penetanguishene, Canada
