What’s in a p? Reassessing Best Practices for Conducting and Reporting Hypothesis-Testing Research


Part of the book series: JIBS Special Collections ((JIBSSC))

Abstract

Social science research has recently been subject to considerable criticism regarding the validity and power of empirical tests published in leading journals, and business scholarship is no exception. Transparency and replicability of empirical findings are essential to build a cumulative body of scholarly knowledge. Yet current practices are under increased scrutiny to achieve these objectives. JIBS is therefore discussing and revising its editorial practices to enhance the validity of empirical research. In this editorial, we reflect on best practices with respect to conducting, reporting, and discussing the results of quantitative hypothesis-testing research, and we develop guidelines for authors to enhance the rigor of their empirical work. This will not only help readers to assess empirical evidence comprehensively, but also enable subsequent research to build a cumulative body of empirical knowledge.


Notes

  1.

    In many disciplines contributing to international business research, conventional Type 1 error probabilities are p < 0.05 or 0.01. There are situations where a higher Type 1 error probability, such as p < 0.10, might be justified (Cascio and Zedeck 1983; Aguinis et al. 2010), for example, when the dataset is small and a larger dataset is unrealistic to obtain.

  2.

    Note that according to Dalton et al. (2012), the selection bias (or file-drawer problem) does not appear to affect correlation tables in published versus unpublished papers.

  3.

    A “true” p-value would be the p-value observed in a regression analysis that was designed based on all available theoretical knowledge (e.g., regarding the measurement of variables and the inclusion of controls), and not changed after seeing the first regression results.

  4.

    Brodeur et al. (2016) extensively test whether this assumption holds, as well as the sensitivity of the overall distribution to issues like rounding, the number of tests performed in each article, number of tables included, and many more. Similar to Brodeur et al. (2016), we explored the sensitivity of the shape of the distribution to such issues, and we have no reason to assume that the final result in Figure 4.1 is sensitive to these issues.

  5.

    The spikes at z-scores of 3, 4, and 5 are the result of rounding and are an artefact of the data. As coefficients and standard errors reported in tables are rounded – often at 2 or 3 digits – very small coefficients and standard errors automatically imply ratios of rounded numbers, and as a consequence, result in a relatively large number of z-scores with the integer value of 3, 4, or 5. This observation is in line with the findings reported for Economics journals by Brodeur et al. (2016).

  6.

    The data on which the graph is based are taken from Beugelsdijk et al. (2014).

  7.

    If authors believe that certain suggested additional tests are not reasonable or not feasible (for example, because certain data do not exist), then they should communicate that in their reply. The editor then has to evaluate the merits of the arguments of authors and reviewers, if necessary bringing in an expert on a particular methodology at hand. If the latter is required, this can be indicated in the Manuscript Central submission process.

  8.

    A laudable exception is the recent special issue of Strategic Management Journal on replication (Bettis et al. 2016b).

  9.

    The grand total is heavily influenced by SMJ, with 362 tested hypotheses, vis-à-vis 164 in JIBS and 185 in Organization Science.

  10.

    An interesting alternative may be abduction. For example, see Dikova, Parker, and van Witteloostuijn (2017), who define abduction as “a form of logical inference that begins with an observation and concludes with a hypothesis that accounts for the observation, ideally seeking to find the simplest and most likely explanation.” See also, e.g., Misangyi and Acharya (2014).
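The rounding artefact described in note 5 is easy to reproduce with a short simulation. This is an illustrative sketch with invented magnitudes, not the chapter's data: small coefficients and standard errors are rounded to three decimals before a reader computes z = coefficient/standard error, so the ratio becomes a ratio of small integers and frequently lands exactly on an integer z-score.

```python
import random

# Illustrative simulation (invented magnitudes, not the chapter's data):
# coefficients and standard errors are rounded to 3 decimals before the
# ratio z = coef / s.e. is computed, so z often equals exactly 1, 2, 3, ...
random.seed(0)

def integer_z_after_rounding(digits=3):
    b = random.uniform(0.001, 0.02)   # "true" (unrounded) coefficient
    se = random.uniform(0.001, 0.01)  # "true" (unrounded) standard error
    scale = 10 ** digits
    m = round(round(b, digits) * scale)   # rounded coefficient, in 0.001 ticks
    k = round(round(se, digits) * scale)  # rounded s.e., in 0.001 ticks
    return k > 0 and m % k == 0           # ratio is an exact integer z-score

trials = 100_000
share = sum(integer_z_after_rounding() for _ in range(trials)) / trials
print(f"share of reported z-scores that are exact integers: {share:.1%}")
```

With the unrounded values this share would be essentially zero; rounding alone produces the pile-up at integer z-values that appears as spikes in the distribution.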

References

  • Aguinis, H., S. Werner, J.L. Abbott, C. Angert, J.H. Park, and D. Kohlhausen. 2010. Customer-centric research: Reporting significant research results with rigor, relevance, and practical impact in mind. Organizational Research Methods 13 (3): 515–539.
  • Andersson, U., A. Cuervo-Cazurra, and B.B. Nielsen. 2014. Explaining interaction effects within and across levels of analysis. Journal of International Business Studies 45 (9): 1063–1071.
  • Angrist, J.D., and A. Krueger. 2001. Instrumental variables and the search for identification: From supply and demand to natural experiments. Journal of Economic Perspectives 15 (4): 69–85.
  • Angrist, J.D., and J.S. Pischke. 2010. The credibility revolution in empirical economics: How better research design is taking the con out of econometrics. Journal of Economic Perspectives 24 (2): 3–30.
  • Antonakis, J., S. Bendahan, P. Jacquart, and R. Lalive. 2010. On making causal claims: A review and recommendations. Leadership Quarterly 21 (6): 1086–1120.
  • Barley, S.R. 2016. 60th anniversary essay: Ruminations on how we became a mystery house and how we might get out. Administrative Science Quarterly 61 (1): 1–8.
  • Bedeian, A.G., S.G. Taylor, and A. Miller. 2010. Management science on the credibility bubble: Cardinal sins and various misdemeanors. Academy of Management Learning & Education 9 (4): 715–725.
  • Bettis, R.A. 2012. The search for asterisks: Compromised statistical tests and flawed theory. Strategic Management Journal 33 (1): 108–113.
  • Bettis, R.A., S. Ethiraj, A. Gambardella, C.E. Helfat, and W. Mitchell. 2016a. Creating repeatable cumulative knowledge in strategic management. Strategic Management Journal 37 (2): 257–261.
  • Bettis, R.A., C.E. Helfat, and M.J. Shaver. 2016b. Special issue: Replication in strategic management. Strategic Management Journal 37 (11): 2191–2388.
  • Beugelsdijk, S., H.L.F. de Groot, and A.B.T.M. van Schaik. 2004. Trust and economic growth: A robustness analysis. Oxford Economic Papers 56 (1): 118–134.
  • Beugelsdijk, S., A. Slangen, M. Onrust, A. van Hoorn, and R. Maseland. 2014. The impact of home-host cultural distance on foreign affiliate sales: The moderating role of cultural variation within host countries. Journal of Business Research 67 (8): 1638–1646.
  • Bhattacharjee, Y. 2013. The mind of a con man. New York Times Magazine, April 26.
  • Bobko, P. 2001. Correlation and regression: Applications for industrial organizational psychology and management. 2nd ed. Thousand Oaks: Sage.
  • Bosco, F.A., H. Aguinis, K. Singh, J.G. Field, and C.A. Pierce. 2015. Correlational effect size benchmarks. Journal of Applied Psychology 100 (2): 431–449.
  • Bosco, F.A., H. Aguinis, J.G. Field, C.A. Pierce, and D.R. Dalton. 2016. HARKing’s threat to organizational research: Evidence from primary and meta-analytic sources. Personnel Psychology 69 (3): 709–750.

  • Brambor, T., W.R. Clark, and M. Golder. 2006. Understanding interaction models: Improving empirical analyses. Political Analysis 14 (1): 63–82.
  • Branch, M. 2014. Malignant side-effects of null-hypothesis testing. Theory and Psychology 24 (2): 256–277.
  • Brodeur, A., M. Le, M. Sangnier, and Y. Zylberberg. 2016. Star wars: The empirics strike back. American Economic Journal: Applied Economics 8 (1): 1–32.
  • Buckley, P., T. Devinney, and J.J. Louviere. 2007. Do managers behave the way theory suggests? A choice-theoretic examination of foreign direct investment location decision-making. Journal of International Business Studies 38 (7): 1069–1094.
  • Cascio, W.F., and S. Zedeck. 1983. Open a new window in rational research planning: Adjust alpha to maximize statistical power. Personnel Psychology 36 (3): 517–526.
  • Choi, J., and F. Contractor. 2016. Choosing an appropriate alliance governance mode: The role of institutional, cultural and geographic distance in international research & development (R&D) collaborations. Journal of International Business Studies 47 (2): 210–232.
  • Cohen, J. 1969. Statistical power analysis for the behavioral sciences. New York: Academic Press.
  • Cortina, J.M., T. Kohler, and B.B. Nielsen. 2015. Restriction of variance interaction effects and their importance for international business. Journal of International Business Studies 46 (8): 879–885.
  • Crosswell, J.M., et al. 2009. Cumulative incidence of false positive results in repeated, multimodal cancer screening. Annals of Family Medicine 7 (3): 212–222.
  • Dalton, D.R., H. Aguinis, C.A. Dalton, F.A. Bosco, and C.A. Pierce. 2012. Revisiting the file drawer problem in meta-analysis: An empirical assessment of published and non-published correlation matrices. Personnel Psychology 65 (2): 221–249.
  • Dikova, D., S.C. Parker, and A. van Witteloostuijn. 2017. Capability, environment and internationalization fit, and financial and marketing performance of MNEs’ foreign subsidiaries: An abductive contingency approach. Cross-Cultural and Strategic Management 24 (3): 405–435.
  • Doh, J. 2015. Why we need phenomenon-based research in international business. Journal of World Business 50 (4): 609–611.
  • Doucouliagos, C., and T.D. Stanley. 2013. Are all economic facts greatly exaggerated? Theory competition and selectivity. Journal of Economic Surveys 27 (2): 316–339.
  • Economist. 2014. When science gets it wrong: Let the light shine in. June 14. http://www.economist.com/news/science–and–technology/21604089-two-big-recent-scientific-results-are-looking-shakyand-it-open-peer-review. Accessed 23 Mar 2017.
  • Ferguson, C.J., and M. Heene. 2012. A vast graveyard of undead theories: Publication bias and psychological science’s aversion to the null. Perspectives on Psychological Science 7 (6): 555–561.
  • Fisher, R.A. 1925. Statistical methods for research workers. Edinburgh: Oliver and Boyd.
  • Fisher, R., and S. Schwartz. 2011. Whence differences in value priorities? Individual, cultural, and artefactual sources. Journal of Cross-Cultural Psychology 42 (7): 1127–1144.
  • Fox, P.J., and C.A.W. Glas. 2002. Modeling measurement error in a structural multilevel model. In Latent variable and latent structure models, ed. G.A. Marcoulides and I. Moustaki. London: Lawrence Erlbaum Associates.

  • Gerber, A.S., D.P. Green, and D. Nickerson. 2001. Testing for publication bias in political science. Political Analysis 9 (4): 385–392.
  • Gigerenzer, G. 2004. Mindless statistics. Journal of Socio-Economics 33 (5): 587–606.
  • Goldfarb, B., and A. King. 2016. Scientific apophenia in strategic management research: Significance tests & mistaken inference. Strategic Management Journal 37 (1): 167–176.
  • Görg, H., and E. Strobl. 2001. Multinational companies and productivity spillovers: A meta-analysis with a test for publication bias. Economic Journal 111: F723–F739.
  • Greene, W. 2010. Testing hypotheses about interaction terms in nonlinear models. Economics Letters 107: 291–296.
  • Grieneisen, M.L., and M. Zhang. 2012. A comprehensive survey of retracted articles from the scholarly literature. PLoS One 7 (10): e44118. https://doi.org/10.1371/journal.pone.0044118.
  • Haans, R.F.P., C. Pieters, and Z.L. He. 2016. Thinking about U: Theorizing and testing U- and inverted U-shaped relationships in strategy research. Strategic Management Journal 37 (7): 1177–1196.
  • Head, M.L., L. Holman, R. Lanfear, A.T. Kahn, and M.D. Jennions. 2015. The extent and consequences of p-hacking in science. PLoS Biology 13 (3): e1002106. https://doi.org/10.1371/journal.pbio.1002106.
  • Henrich, J., S.J. Heine, and A. Norenzayan. 2010a. The weirdest people in the world? Behavioral and Brain Sciences 33 (2–3): 61–83.
  • ———. 2010b. Most people are not WEIRD. Nature 466: 29.
  • Hoetker, G. 2007. The use of logit and probit models in strategic management research: Critical issues. Strategic Management Journal 28 (4): 331–343.
  • Hubbard, R., D.E. Vetter, and E.L. Little. 1998. Replication in strategic management: Scientific testing for validity, generalizability, and usefulness. Strategic Management Journal 19 (3): 243–254.
  • Hunter, J.E., and F.L. Schmidt. 2015. Methods of meta-analysis: Correcting error and bias in research findings. 2nd ed. Thousand Oaks: Sage.
  • Husted, B.W., I. Montiel, and P. Christmann. 2016. Effects of local legitimacy on certification decision to global and national CSR standards by multinational subsidiaries and domestic firms. Journal of International Business Studies 47 (3): 382–397.
  • Ioannidis, J.P.A. 2005. Why most published research findings are false. PLoS Medicine 2 (8): e124.
  • ———. 2012. Why science is not necessarily self-correcting. Perspectives on Psychological Science 7 (6): 645–654.
  • John, L.K., G. Loewenstein, and D. Prelec. 2012. Measuring the prevalence of questionable research practices with incentives for truth-telling. Psychological Science 23 (5): 524–532.
  • Kerr, N.L. 1998. HARKing: Hypothesizing after results are known. Personality and Social Psychology Review 2 (3): 196–217.
  • Kingsley, A.F., T.G. Noordewier, and R.G. Vanden Bergh. 2017. Overstating and understating interaction results in international business research. Journal of World Business 52 (2): 286–295.

  • Kirk, R.E. 1996. Practical significance: A concept whose time has come. Educational and Psychological Measurement 56 (5): 746–759.
  • Leamer, E.E. 1985. Sensitivity analyses would help. American Economic Review 75 (3): 308–313.
  • Lewin, A.Y., C.Y. Chiu, C.F. Fey, S.S. Levine, G. McDermott, J.P. Murmann, and E. Tsang. 2016. The critique of empirical social science: New policies at Management and Organization Review. Management and Organization Review 12 (4): 649–658.
  • Lexchin, J., L.A. Bero, B. Djulbegovic, and O. Clark. 2003. Pharmaceutical industry sponsorship and research outcome and quality: Systematic review. British Medical Journal 326 (7400): 1167–1170.
  • Masicampo, E.J., and D.R. Lalande. 2012. A peculiar prevalence of p-values just below 0.05. Quarterly Journal of Experimental Psychology 65 (11): 2271–2279.
  • McCloskey, D.N. 1985. The loss function has been mislaid: The rhetoric of significance tests. American Economic Review 75 (2): 201–205.
  • McCloskey, D.N., and S.T. Ziliak. 1996. The standard error of regressions. Journal of Economic Literature 34: 97–114.
  • Meyer, K.E. 2006. Asian management research needs more self-confidence. Asia Pacific Journal of Management 23 (2): 119–137.
  • ———. 2009. Motivating, testing, and publishing curvilinear effects in management research. Asia Pacific Journal of Management 26 (2): 187–193.
  • Misangyi, V.F., and A.G. Acharya. 2014. Substitutes or complements? A configurational examination of corporate governance mechanisms. Academy of Management Journal 57 (6): 1681–1705.
  • Mullane, K., and M. Williams. 2013. Bias in research: The rule rather than the exception? Elsevier Journal. http://editorsupdate.elsevier.com/issue-40-september-2013/bias-in-research-the-rule-rather-than-the-exception. Accessed 23 Mar 2017.
  • New York Times. 2011. Fraud case seen as a red flag for psychology research. November 2. http://www.nytimes.com/2011/11/03/health/research/noted-dutch-psychologist-stapel-accused-of-research-fraud.html?-r=1&ref=research. Accessed 15 Jan 2017.
  • Open Science Collaboration. 2015. Estimating the reproducibility of psychological science. Science. https://doi.org/10.1126/science.aac4716.
  • Orlitzky, M. 2012. How can significance tests be deinstitutionalized? Organizational Research Methods 15 (2): 199–228.
  • Pashler, H., and E.-J. Wagenmakers. 2012. Editors’ introduction to the special section on replicability in psychological science: A crisis of confidence? Perspectives on Psychological Science 7 (6): 528–530.
  • Peterson, M., J.L. Arregle, and X. Martin. 2012. Multi-level models in international business research. Journal of International Business Studies 43 (5): 451–457.

  • Pfeffer, J. 2007. A modest proposal: How we might change the process and product of managerial research. Academy of Management Journal 50 (6): 1334–1345.
  • Popper, K. 1959. The logic of scientific discovery. London: Hutchinson.
  • Reeb, D., M. Sakakibara, and I.P. Mahmood. 2012. From the editors: Endogeneity in international business research. Journal of International Business Studies 43 (3): 211–218.
  • Rosenthal, R. 1979. The “file drawer problem” and tolerance for null results. Psychological Bulletin 86 (3): 638–641.
  • Rosnow, R.L., and R. Rosenthal. 1984. Understanding behavioral science: Research methods for consumers. New York: McGraw-Hill.
  • Rothstein, H.R., A.J. Sutton, and M. Borenstein. 2005. Publication bias in meta-analysis: Prevention, assessment and adjustment. New York: Wiley.
  • Sala-i-Martin, X. 1997. I just ran two million regressions. American Economic Review 87 (2): 178–183.
  • Shadish, W.R., T.D. Cook, and D. Campbell. 2002. Experimental and quasi-experimental designs for generalized causal inference. New York: Houghton Mifflin.
  • Simmons, J.P., L.D. Nelson, and U. Simonsohn. 2011. False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science 22 (11): 1359–1366.
  • Sterling, T.D. 1959. Publication decisions and their possible effects on inferences drawn from tests of significance – or vice versa. Journal of the American Statistical Association 54 (285): 30–34.
  • van Witteloostuijn, A. 2015. Toward experimental international business: Unraveling fundamental causal linkages. Cross Cultural & Strategic Management 22 (4): 530–544.
  • ———. 2016. What happened to Popperian falsification? Publishing neutral and negative findings. Cross Cultural & Strategic Management 23 (3): 481–508.
  • Wasserstein, R.L., and N.A. Lazar. 2016. The ASA’s statement on p-values: Context, process, and purpose. American Statistician 70 (2): 129–133. http://amstat.tandfonline.com/doi/full/10.1080/00031305.2016.1154108. (ASA = American Statistical Association).
  • Wiersema, M.F., and H.P. Bowen. 2009. The use of limited dependent variable techniques in strategy research: Issues and methods. Strategic Management Journal 30 (6): 679–692.
  • Williams, R. 2012. Using the margins command to estimate and interpret adjusted predictions and marginal effects. Stata Journal 12 (2): 308.
  • Wonnacott, T.H., and R.J. Wonnacott. 1990. Introductory statistics for business and economics. New York: Wiley.
  • Zedeck, S. 2003. Editorial. Journal of Applied Psychology 88 (1): 3–5.
  • Zellmer-Bruhn, M., P. Caligiuri, and D. Thomas. 2016. From the editors: Experimental designs in international business research. Journal of International Business Studies 47 (4): 399–407.
  • Zelner, B. 2009. Using simulation to interpret results from logit, probit, and other nonlinear models. Strategic Management Journal 30 (12): 1335–1348.

Acknowledgements

We gratefully acknowledge the constructive comments from Editor-in-Chief Alain Verbeke, eleven editors of JIBS, as well as from Bas Bosma, Lin Cui, Rian Drogendijk, Saul Estrin, Anne-Wil Harzing, Jing Li, Richard Walker, and Tom Wansbeek. We also thank Divina Alexiou, Richard Haans, Johannes Kleinhempel, Sjanne van Oeveren, Britt van Veen, and Takumin Wang for their excellent research assistance. Sjoerd Beugelsdijk thanks the Netherlands Organization for Scientific Research (NWO grant VIDI 452-011-10). All three authors contributed equally to this editorial.

Corresponding author

Correspondence to Klaus E. Meyer.

Appendix 1: Stata Do File to Create Fig. 4.2

Model:

  • Dependent variable = Y
  • Independent variable = X
  • Moderator variable = M
  • Interaction variable = X*M

To generate Fig. 4.2:

predictnl me = _b[X] + _b[X*M]*M if e(sample), se(seme)
gen pw1 = me - 1.96*seme
gen pw2 = me + 1.96*seme
scatter me M if e(sample) || line me pw1 pw2 M if e(sample), pstyle(p2 p3 p3) sort legend(off) ytitle("Marginal effect of X on Y")
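The same computation can be sketched outside Stata. The do-file's predictnl line evaluates the marginal effect of X, dY/dX = b_X + b_XM * M, with a delta-method standard error. A minimal Python equivalent follows; the coefficient and covariance values are made up for illustration, not estimates from the chapter:

```python
import math

# Hypothetical estimates standing in for a fitted interaction model:
# coefficients on X and X*M, plus the relevant block of their covariance matrix.
b_x, b_xm = 0.40, -0.15
var_x, var_xm, cov_x_xm = 0.010, 0.004, -0.002

def marginal_effect(m):
    """Marginal effect of X on Y at moderator value m, with a 95% band.

    me = b_x + b_xm*m; delta-method variance:
    var(me) = var_x + m^2 * var_xm + 2*m*cov(b_x, b_xm).
    """
    me = b_x + b_xm * m
    se = math.sqrt(var_x + m * m * var_xm + 2 * m * cov_x_xm)
    return me, me - 1.96 * se, me + 1.96 * se

for m in (0.0, 1.0, 2.0):
    me, lo, hi = marginal_effect(m)
    print(f"M={m:.1f}: ME={me:.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
```

Plotting the point estimate and the two band variables against M reproduces the figure that the scatter/line commands draw in Stata.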


Copyright information

© 2020 The Author(s)


Cite this chapter

Meyer, K.E., van Witteloostuijn, A., Beugelsdijk, S. (2020). What’s in a p? Reassessing Best Practices for Conducting and Reporting Hypothesis-Testing Research. In: Eden, L., Nielsen, B.B., Verbeke, A. (eds) Research Methods in International Business. JIBS Special Collections. Palgrave Macmillan, Cham. https://doi.org/10.1007/978-3-030-22113-3_4
