
Journal of Business and Psychology, Volume 34, Issue 1, pp 19–37

A 20-Year Review of Outcome Reporting Bias in Moderated Multiple Regression

  • Ernest O’Boyle (corresponding author)
  • George C. Banks
  • Kameron Carter
  • Sheryl Walter
  • Zhenyu Yuan
Original Paper

Abstract

Moderated multiple regression (MMR) remains the most popular method of testing interactions in management and applied psychology. Recent discussions of MMR have centered on its typically small effect sizes and low statistical power (e.g., Murphy & Russell, Organizational Research Methods, 2016). Although many MMR tests are likely plagued by type II errors, they may also be particularly prone to outcome reporting bias (ORB), which inflates the rate of false positives (type I errors). We assessed the state of MMR through a 20-year review of six leading journals. Based on 1218 MMR tests nested within 343 studies, we found that despite low statistical power, most MMR tests (54%) were reported as statistically significant. Further, although sample sizes remained essentially unchanged over time (r = −.002), the proportion of statistically significant MMR tests rose from 41% (1995–1999) to 49% (2000–2004), to 60% (2005–2009), and to 69% (2010–2014). This trend could indicate greater methodological and theoretical precision, but it also leaves open the possibility of ORB. In our review, we found evidence that both increased rigor and theoretical precision play an important role in MMR effect size magnitudes, but we also found evidence of ORB. Specifically, (a) smaller sample sizes are associated with larger effect sizes, (b) there is a substantial frequency spike in p values just below the .05 threshold, and (c) recalculated p values less than .05 always converged with authors’ conclusions of statistical significance, whereas recalculated p values between .05 and .10 converged with authors’ conclusions only about half (54%) of the time. These findings have important implications for the future application of MMR.
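To make the two procedures referenced in the abstract concrete, below is a minimal sketch (in Python, not the authors' code) of (1) an MMR test in which the product term x × z carries the moderation hypothesis, and (2) recalculating a two-tailed p value from a reported t statistic and its residual degrees of freedom, the kind of check used to compare authors' significance claims against their reported statistics. All variable names and numeric values are illustrative.

```python
# Minimal sketch of a moderated multiple regression (MMR) test and a p-value
# recalculation; values and names are illustrative, not from the article.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

rng = np.random.default_rng(0)
n = 150                                   # a modest, typical sample size
x = rng.normal(size=n)                    # predictor
z = rng.normal(size=n)                    # hypothesized moderator
y = 0.3 * x + 0.2 * z + 0.15 * x * z + rng.normal(size=n)  # small interaction
df = pd.DataFrame({"y": y, "x": x, "z": z})

# (1) MMR: "y ~ x * z" expands to main effects plus the x:z product term;
# the p value for x:z is the test of the interaction (moderation) hypothesis.
fit = smf.ols("y ~ x * z", data=df).fit()
print(fit.pvalues["x:z"])

# (2) Recalculate a two-tailed p value from reported statistics
# (hypothetical reported t and residual df).
t_reported, df_resid = 1.80, 146
p_recalc = 2 * stats.t.sf(abs(t_reported), df_resid)
print(round(p_recalc, 3))   # ~.07: between .05 and .10, i.e., not significant at .05
```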

Keywords

Outcome reporting bias · Publication bias · Questionable reporting practices · Moderated multiple regression · Meta-analysis

References

  1. Aguinis, H., & Gottfredson, R. K. (2010). Best-practice recommendations for estimating interaction effects using moderated multiple regression. Journal of Organizational Behavior, 31, 776–786. https://doi.org/10.1002/job.686
  2. Aguinis, H., & Stone-Romero, E. F. (1997). Methodological artifacts in moderated multiple regression and their effects on statistical power. Journal of Applied Psychology, 82, 192–206. https://doi.org/10.1037//0021-9010.82.1.192
  3. Aiken, L. S., & West, S. G. (1991). Multiple regression: Testing and interpreting interactions. Thousand Oaks, CA: Sage.
  4. Antonakis, J. (2017). On doing better science: From thrill of discovery to policy implications. The Leadership Quarterly, 28(1), 5–21.
  5. Banks, G. C., Kepes, S., & McDaniel, M. A. (2015). Publication bias: Understanding the myths concerning threats to the advancement of science. In C. E. Lance & R. J. Vandenberg (Eds.), More statistical and methodological myths and urban legends (pp. 36–64). New York, NY: Routledge.
  6. Banks, G. C., & McDaniel, M. A. (2011). The kryptonite of evidence-based I-O psychology. Industrial and Organizational Psychology: Perspectives on Science and Practice, 4, 40–44. https://doi.org/10.1111/j.1754-9434.2010.01292.x
  7. Banks, G. C., O’Boyle, E. H., Pollack, J. M., White, C. D., Batchelor, J. H., Whelpley, C. E., et al. (2016). Questions about questionable research practices in the field of management: A guest commentary. Journal of Management, 42, 5–20. https://doi.org/10.1177/0149206315619011
  8. Banks, G. C., Rogelberg, S. G., Woznyj, H. M., Landis, R. S., & Rupp, D. E. (2016). Evidence on questionable research practices: The good, the bad, and the ugly. Journal of Business and Psychology, 31, 323–338. https://doi.org/10.1007/s10869-016-9456-7
  9. Baron, R. M., & Kenny, D. A. (1986). The moderator–mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51, 1173–1182. https://doi.org/10.1037/0022-3514.51.6.1173
  10. Bennett, R. J., & Robinson, S. L. (2000). Development of a measure of workplace deviance. Journal of Applied Psychology, 85, 349–360. https://doi.org/10.1037/0021-9010.85.3.349
  11. Bergh, D. D., Sharp, B. M., & Li, M. (2017). Tests for identifying “red flags” in empirical findings: Demonstration and recommendations for authors, reviewers, and editors. Academy of Management Learning and Education, 16, 110–124. https://doi.org/10.5465/amle.2015.0406
  12. Biemann, T. (2013). What if we were Texas sharpshooters? Predictor reporting bias in regression analysis. Organizational Research Methods, 16, 335–363. https://doi.org/10.1177/1094428113485135
  13. Bobko, P. (1986). A solution to some dilemmas when testing hypotheses about ordinal interactions. Journal of Applied Psychology, 71, 323–326. https://doi.org/10.1037/0021-9010.71.2.323
  14. Bosco, F. A., Aguinis, H., Field, J. G., Pierce, C. A., & Dalton, D. R. (2016). HARKing’s threat to organizational research: Evidence from primary and meta-analytic sources. Personnel Psychology, 69, 709–750. https://doi.org/10.1111/peps.12111
  15. Bosco, F. A., Aguinis, H., Singh, K., Field, J. G., & Pierce, C. A. (2015). Correlational effect size benchmarks. Journal of Applied Psychology, 100, 431–449. https://doi.org/10.1037/a0038047
  16. Cohen, J. (1988). Statistical power analysis for the behavioral sciences. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
  17. Cohen, J., & Cohen, P. (1983). Applied multiple regression/correlation analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum.
  18. Cortina, J. M. (1993). Interaction, nonlinearity, and multicollinearity: Implications for multiple regression. Journal of Management, 19, 915–922. https://doi.org/10.1016/0149-2063(93)90035-L
  19. Cortina, J. M., Green, J. P., Keeler, K. R., & Vandenberg, R. J. (in press). Degrees of freedom in SEM: Are we testing the models that we claim to test? Organizational Research Methods. Advance online publication. https://doi.org/10.1177/1094428116676345
  20. Cronbach, L. J. (1987). Statistical tests for moderator variables: Flaws in analyses recently proposed. Psychological Bulletin, 102, 414–417. https://doi.org/10.1037/0033-2909.102.3.414
  21. de Winter, J. C., & Dodou, D. (2015). A surge of p-values between 0.041 and 0.049 in recent decades (but negative results are increasing rapidly too). PeerJ, 3, e733. https://doi.org/10.7717/peerj.733
  22. Editors. (1909). The reporting of unsuccessful cases. The Boston Medical and Surgical Journal, 161, 263–264. https://doi.org/10.1056/NEJM190908191610809
  23. Edwards, J. R., & Berry, J. W. (2010). The presence of something or the absence of nothing: Increasing theoretical precision in management research. Organizational Research Methods, 13, 668–689. https://doi.org/10.1177/1094428110380467
  24. Emerson, G. B., Warme, W. J., Wolf, F. M., Heckman, J. D., Brand, R. A., & Leopold, S. S. (2010). Testing for the presence of positive-outcome bias in peer review: A randomized controlled trial. Archives of Internal Medicine, 170, 1934–1939. https://doi.org/10.1001/archinternmed.2010.406
  25. Evans, M. G. (1985). A Monte Carlo study of the effects of correlated method variance in moderated multiple regression analysis. Organizational Behavior and Human Decision Processes, 36, 305–323. https://doi.org/10.1016/0749-5978(85)90002-0
  26. Fanelli, D. (2012). Negative results are disappearing from most disciplines and countries. Scientometrics, 90, 891–904. https://doi.org/10.1007/s11192-011-0494-7
  27. Finkel, E. J., Eastwick, P. W., & Reis, H. T. (2015). Best research practices in psychology: Illustrating epistemological and pragmatic considerations with the case of relationship science. Journal of Personality and Social Psychology, 108, 275–297. https://doi.org/10.1037/pspi0000007
  28. Franco, A., Malhotra, N., & Simonovits, G. (2014). Publication bias in the social sciences: Unlocking the file drawer. Science, 345, 1502–1505. https://doi.org/10.1126/science.1255484
  29. Gerber, A. S., & Malhotra, N. (2008a). Do statistical reporting standards affect what is published? Publication bias in two leading political science journals. Quarterly Journal of Political Science, 3, 313–326. https://doi.org/10.1561/100.00008024
  30. Gerber, A. S., & Malhotra, N. (2008b). Publication bias in empirical sociological research: Do arbitrary significance levels distort published results? Sociological Methods & Research, 37, 3–30. https://doi.org/10.1177/0049124108318973
  31. Grand, J. A., Rogelberg, S. G., Banks, G. C., Landis, R. S., & Tonidandel, S. (in press). From outcome to process focus: Fostering a more robust psychological science through registered reports and results-blind reviewing. Perspectives on Psychological Science.
  32. Greco, L. M., O’Boyle, E. H., Cockburn, B. S., & Yuan, Z. (in press). A reliability generalization examination of organizational behavior constructs. Journal of Management Studies.
  33. Greenwald, A. G. (1975). Consequences of prejudice against the null hypothesis. Psychological Bulletin, 82, 1–20. https://doi.org/10.1037/h0076157
  34. Hardwicke, T. E., Mathur, M., MacDonald, K., Nilsonne, G., Banks, G. C., Kidwell, M. C., ... Tessler, M. H. (2018). Data availability, reusability, and analytic reproducibility: Evaluating the impact of a mandatory open data policy at the journal Cognition.
  35. Hartgerink, C. H., van Aert, R. C., Nuijten, M. B., Wicherts, J. M., & Van Assen, M. A. (2016). Distributions of p-values smaller than .05 in psychology: What is going on? PeerJ, 4, e1935. https://doi.org/10.7717/peerj.1935
  36. Hollenbeck, J. R., & Wright, P. M. (2016). Harking, sharking, and tharking: Making the case for post hoc analysis of scientific data. Journal of Management, 43, 5–18. https://doi.org/10.1177/0149206316679487
  37. Ioannidis, J. P. A. (2008). Why most discovered true associations are inflated. Epidemiology, 19, 640–648. https://doi.org/10.1097/EDE.0b013e31818131e7
  38. Jaccard, J., Wan, C. K., & Turrisi, R. (1990). The detection and interpretation of interaction effects between continuous variables in multiple regression. Multivariate Behavioral Research, 25, 467–478. https://doi.org/10.1207/s15327906mbr2504_4
  39. James, L. R., & Brett, J. M. (1984). Mediators, moderators, and tests for mediation. Journal of Applied Psychology, 69, 307–321. https://doi.org/10.1037/0021-9010.69.2.307
  40. John, L. K., Loewenstein, G., & Prelec, D. (2012). Measuring the prevalence of questionable research practices with incentives for truth telling. Psychological Science, 23, 524–532. https://doi.org/10.1177/0956797611430953
  41. Journal Citation Reports® (2014). Social Science Edition. (Thomson Reuters, 2015). http://jcr.incites.thomsonreuters.com
  42. Kepes, S., Banks, G. C., McDaniel, M. A., & Whetzel, D. L. (2012). Publication bias in the organizational sciences. Organizational Research Methods, 15, 624–662. https://doi.org/10.1177/1094428112452760
  43. Kerr, N. L. (1998). HARKing: Hypothesizing after the results are known. Personality and Social Psychology Review, 2, 196–217. https://doi.org/10.1207/s15327957pspr0203_4
  44. Krawczyk, M. (2015). The search for significance: A few peculiarities in the distribution of P values in experimental psychology literature. PLoS One, 10(6), e0127872. https://doi.org/10.1371/journal.pone.0127872
  45. Kühberger, A., Fritz, A., & Scherndl, T. (2014). Publication bias in psychology: A diagnosis based on the correlation between effect size and sample size. PLoS One, 9(9), e105825. https://doi.org/10.1371/journal.pone.0105825
  46. LeBreton, J. M. (2016). Editorial. Organizational Research Methods, 19, 3–7. https://doi.org/10.1177/1094428115622097
  47. LeBreton, J. M., Tonidandel, S., & Krasikova, D. V. (2013). Residualized relative importance analysis: A technique for the comprehensive decomposition of variance in higher order regression models. Organizational Research Methods, 16, 449–473. https://doi.org/10.1177/1094428113481065
  48. Leggett, N. C., Thomas, N. A., Loetscher, T., & Nicholls, M. E. (2013). The life of p: “Just significant” results are on the rise. The Quarterly Journal of Experimental Psychology, 66, 2303–2309. https://doi.org/10.1080/17470218.2013.863371
  49. Masicampo, E. J., & Lalande, D. R. (2012). A peculiar prevalence of p values just below .05. The Quarterly Journal of Experimental Psychology, 65, 2271–2279. https://doi.org/10.1080/17470218.2012.711335
  50. Matthes, J., Marquart, F., Naderer, B., Arendt, F., Schmuck, D., & Adam, K. (2015). Questionable research practices in experimental communication research: A systematic analysis from 1980 to 2013. Communication Methods and Measures, 9(4), 193–207. https://doi.org/10.1080/19312458.2015.1096334
  51. Murphy, K. R., & Russell, C. J. (2016). Mend it or end it: Redirecting the search for interactions in the organizational sciences. Organizational Research Methods. Advance online publication. https://doi.org/10.1177/1094428115625322
  52. Nosek, B. A., Alter, G., Banks, G. C., Borsboom, D., Bowman, S. D., Breckler, S. J., et al. (2015). Promoting an open research culture: The TOP guidelines for journals. Science, 348, 1422–1425. https://doi.org/10.1126/science.aab2374
  53. Nosek, B. A., & Bar-Anan, Y. (2012). Scientific utopia: I. Opening scientific communication. Psychological Inquiry, 23(3), 217–243. https://doi.org/10.1080/1047840X.2012.692215
  54. Nuijten, M. B., Hartgerink, C. H., van Assen, M. A., Epskamp, S., & Wicherts, J. M. (2016). The prevalence of statistical reporting errors in psychology (1985–2013). Behavior Research Methods, 48, 1205–1226. https://doi.org/10.3758/s13428-015-0664-2
  55. O’Boyle, E. H., Banks, G. C., & Gonzalez-Mulé, E. (2017). The chrysalis effect: How ugly initial results metamorphosize into beautiful articles. Journal of Management, 43, 376–399. https://doi.org/10.1177/0149206314527133
  56. Orlitzky, M. (2012). How can significance tests be deinstitutionalized? Organizational Research Methods, 15, 199–228. https://doi.org/10.1177/1094428111428356
  57. Porter, T. M. (1992). Quantification and the accounting ideal in science. Social Studies of Science, 22, 633–652. https://doi.org/10.1177/030631292022004004
  58. Robinson, S. L., & Bennett, R. J. (1995). A typology of deviant workplace behaviors: A multidimensional scaling study. Academy of Management Journal, 38, 555–572. https://doi.org/10.2307/256693
  59. Rosenthal, R. (1979). The file drawer problem and tolerance for null results. Psychological Bulletin, 86, 638–641. https://doi.org/10.1037/0033-2909.86.3.638
  60. Russell, C. J., & Bobko, P. (1992). Moderated regression analysis and Likert scales: Too coarse for comfort. Journal of Applied Psychology, 77, 336–342. https://doi.org/10.1037//0021-9010.77.3.336
  61. Scandura, T. A., & Williams, E. A. (2000). Research methodology in management: Current practices, trends, and implications for future research. Academy of Management Journal, 43, 1248–1264. https://doi.org/10.2307/1556348
  62. Schmidt, F. L., & Hunter, J. E. (2015). Methods of meta-analysis: Correcting error and bias in research findings (3rd ed.). Thousand Oaks, CA: Sage.
  63. Schwab, A., & Starbuck, W. H. (in press). A call for openness in research reporting: How to turn covert practices into helpful tools. Academy of Management Learning and Education, 16, 125–141. https://doi.org/10.5465/amle.2016.0039
  64. Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22, 1359–1366. https://doi.org/10.1177/0956797611417632
  65. Simonsohn, U., Simmons, J. P., & Nelson, L. D. (2015). Better P-curves: Making P-curve analysis more robust to errors, fraud, and ambitious P-hacking, a reply to Ulrich and Miller (2015). Journal of Experimental Psychology: General, 144, 1146–1152. https://doi.org/10.1037/xge0000104
  66. Song, F., Parekh, S., Hooper, L., Loke, Y. K., Ryder, J., Sutton, A. J., et al. (2010). Dissemination and publication of research findings: An updated review of related biases. Health Technology Assessment, 14, 1–220. https://doi.org/10.3310/hta14080
  67. Spector, P. E., & Fox, S. (2005). The stressor-emotion model of counterproductive work behavior. In S. Fox & P. E. Spector (Eds.), Counterproductive work behavior: Investigations of actors and targets (pp. 151–174). Washington, DC: American Psychological Association.
  68. Starbuck, W. H. (in press). 60th anniversary essay: How journals could improve research practices in social science. Administrative Science Quarterly, 61, 165–183. https://doi.org/10.1177/0001839216629644
  69. Sterling, T. D. (1959). Publication decisions and their possible effects on inferences drawn from tests of significance—Or vice versa. Journal of the American Statistical Association, 54, 30–34. https://doi.org/10.1080/01621459.1959.10501497
  70. Tonidandel, S., & LeBreton, J. M. (2011). Relative importance analysis: A useful supplement to regression analysis. Journal of Business and Psychology, 26, 1–9. https://doi.org/10.1007/s10869-010-9204-3
  71. Tsang, E. W., & Kwan, K. M. (1999). Replication and theory development in organizational science: A critical realist perspective. Academy of Management Review, 24, 759–780. https://doi.org/10.5465/AMR.1999.2553252
  72. Viechtbauer, W. (2010). Conducting meta-analyses in R with the metafor package. Journal of Statistical Software, 36(3), 1–48. https://doi.org/10.18637/jss.v036.i03
  73. Wagenmakers, E.-J., Wetzels, R., Borsboom, D., van der Maas, H. L., & Kievit, R. A. (2012). An agenda for purely confirmatory research. Perspectives on Psychological Science, 7, 632–638. https://doi.org/10.1177/1745691612463078
  74. Wicherts, J. M., Bakker, M., & Molenaar, D. (2011). Willingness to share research data is related to the strength of the evidence and the quality of reporting of statistical results. PLoS One, 6(11), e26828. https://doi.org/10.1371/journal.pone.0026828

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Ernest O’Boyle (1) (corresponding author)
  • George C. Banks (2)
  • Kameron Carter (3)
  • Sheryl Walter (1)
  • Zhenyu Yuan (3)

  1. Kelley School of Business, Indiana University, Bloomington, USA
  2. Department of Management, Belk College of Business, UNC Charlotte, Charlotte, USA
  3. Tippie College of Business, University of Iowa, Iowa City, USA
