Between the Devil and the Deep Blue Sea: Tensions Between Scientific Judgement and Statistical Model Selection

Navarro, Danielle J.

doi:10.1007/s42113-018-0019-z

Between the Devil and the Deep Blue Sea: Tensions Between Scientific Judgement and Statistical Model Selection

Published: 27 November 2018

Volume 2, pages 28–34, (2019)
Cite this article

Computational Brain & Behavior Aims and scope Submit manuscript

Danielle J. Navarro ORCID: orcid.org/0000-0001-7648-6578¹

9649 Accesses
61 Citations
85 Altmetric
1 Mention
Explore all metrics

Abstract

Discussions of model selection in the psychological literature typically frame the issues as a question of statistical inference, with the goal being to determine which model makes the best predictions about data. Within this setting, advocates of leave-one-out cross-validation and Bayes factors disagree on precisely which prediction problem model selection questions should aim to answer. In this comment, I discuss some of these issues from a scientific perspective. What goal does model selection serve when all models are known to be systematically wrong? How might “toy problems” tell a misleading story? How does the scientific goal of explanation align with (or differ from) traditional statistical concerns? I do not offer answers to these questions, but hope to highlight the reasons why psychological researchers cannot avoid asking them.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Rejoinder: More Limitations of Bayesian Leave-One-Out Cross-Validation

Article Open access 15 January 2019

Quentin F. Gronau & Eric-Jan Wagenmakers

Limitations of “Limitations of Bayesian Leave-one-out Cross-Validation for Model Selection”

Article Open access 30 November 2018

Aki Vehtari, Daniel P. Simpson, … Andrew Gelman

Why the Best Predictive Models Are Often Different from the Best Explanatory Models: A Theoretical Explanation

Notes

While there are many people who assert that “a single failure is enough to falsify a theory”, I confess I have not yet encountered anyone willing to truly follow this principle in real life.
For instance, Gelman et al. (2003, pp. 586–587) present an analogous convergence result for the posterior distribution P(𝜃|x) within a single model . The result generalises to the Bayes factor by noting that the Bayes factor identifies a model with the prior predictive distribution P(x|). Substituting P(x|) for the role of P(x|𝜃) in their derivation produces the necessary result.
For the purposes of full disclosure, I should note that the precise situation from Lee and Navarro (2002) is quite a bit more complex than this description implies, and there are several details about how we had to adapt a model from one context to be applicable to the other have been omitted.

References

Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In B.N., Petrov, & F., Csaki (Eds.) Second international symposium on information theory (pp. 267–281). Budapest: Akademiai Kiado.
Bernardo, J.M., & Smith, A.F.M. (2000). Bayesian theory, 2nd Edn. New York: John Wiley & Sons.
Google Scholar
Box, G.E.P. (1976). Science and statistics. Journal of the American Statistical Association, 71, 791–799.
Article Google Scholar
Browne, M. (2000). Cross-validation methods. Journal of Mathematical Psychology, 44, 108–132.
Article Google Scholar
Devezer, B., Nardin, L.G., Baumgaertner, B., Buzbas, E. (under review). Discovery of truth is not implied by reproducibility but facilitated by innovation and epistemic diversity in a model-centric framework. Manuscript submitted for publication. arXiv:1803.10118.
Edwards, W., Lindman, H., Savage, L.J. (1963). Bayesian statistical inference for psychological research. Psychological Review, 70, 193–242.
Article Google Scholar
Gelman, A., Carlin, J.B., Stern, H.S., Rubin, D.B. (2003). Bayesian Data Analysis, 2nd Edn. Boca Raton: Chapman & Hall/CRC.
Google Scholar
Grünwald, P. (2007). The minimum description length principle. Cambridge: MIT Press.
Book Google Scholar
Gronau, Q., & Wagenmakers, E.J. (2018). Limitations of Bayesian leave-one-out cross-validation for model selection. Computational Brain and Behavior.
Hayes, B.K., Banner, S., Forrester, S., Navarro, D.J. (under review). Sampling frames and inductive inference with censored evidence. Manuscript submitted for publication. https://doi.org/10.17605/OSF.IO/2M83V.
Kamin, L.J. (1969). Predictability, surprise, attention, and conditioning. In Campbell, B.A., & Church, R.M. (Eds.) Punishment and Aversive Behavior (pp. 279–296). New York: Appleton-Century-Crofts.
Kruschke, J.K. (1992). ALCOVE: An exemplar-based connectionist model of category learning. Psychological Review, 99(1), 22–44.
Article Google Scholar
Lake, B.M., Salakhutdinov, R., Tenenbaum, J.B. (2015). Human-level concept learning through probabilistic program induction. Science, 350(6266), 1332–1338.
Article Google Scholar
Lattal, K.M., & Nakajima, S. (1998). Overexpectation in appetitive Pavlovian and instrumental conditioning. Animal Learning & Behavior, 26(3), 351–360.
Article Google Scholar
Lee, M.D. (2001a). On the complexity of additive clustering models. Journal of Mathematical Psychology, 45, 131–148.
Lee, M.D. (2001b). Determining the dimensionality of multidimensional scaling models for cognitive modeling. Journal of Mathematical Psychology, 45, 149–166.
Lee, M.D., & Navarro, D.J. (2002). Extending the ALCOVE model of category learning to featural stimulus domains. Psychonomic Bulletin & Review, 9, 43–58.
Article Google Scholar
Navarro, D.J. (2004). A note on the applied use of MDL approximations. Neural Computation, 16, 1763–1768.
Article Google Scholar
Navarro, D.J., Dry, M.J., Lee, M.D. (2012). Sampling assumptions in inductive generalization. Cognitive Science, 36, 187–223.
Article Google Scholar
Navarro, D.J., Pitt, M.A., Myung, I.J. (2004). Assessing the distinguishability of models and the informativeness of data. Cognitive Psychology, 49, 47–84.
Article Google Scholar
Pavlov, I. (1927). Conditioned reflexes. London: Oxford University Press.
Google Scholar
Pitt, M.A., Myung, I.J., Zhang, S. (2002). Toward a method of selecting among computational models of cognition. Psychological Review, 109, 472–491.
Article Google Scholar
Pitt, M.A., Kim, W., Navarro, D.J., Myung, J.I. (2006). Global model analysis by parameter space partitioning. Psychological Review, 113, 57–83.
Article Google Scholar
Rescorla, R.A. (1968). Probability of shock in the presence and absence of CS in fear conditioning. Journal of Comparative and Physiological Psychology, 66, 1–5.
Article Google Scholar
Rescorla, R.A. (1969). Conditioned inhibition of fear resulting from negative CS-US contingencies. Journal of Comparative and Physiological Psychology, 67, 504–509.
Article Google Scholar
Rescorla, R.A. (1971). Variations in the effectiveness of reinforcement following prior inhibitory conditioning. Learning and Motivation, 2, 113–123.
Article Google Scholar
Rescorla, R.A., & Wagner, A.R. (1972). A theory of Pavlovian conditioning: variations in the effectiveness of reinforcement and nonreinforcement. In A.H., Black, & W.F., Prokasy (Eds.) Classical conditioning II: current research and theory (pp. 64–99). New York: Appleton-Century-Crofts.
Ransom, K., Perfors, A., Navarro, D.J. (2016). Leaping to conclusions: why premise relevance affects argument strength. Cognitive Science, 40, 1775–1796.
Article Google Scholar
Rissanen, J. (1996). Fisher information and stochastic complexity. IEEE Transactions on Information Theory, 42, 40–47.
Article Google Scholar
Schultz, W., Dayan, P., Montague, P.R. (1997). A neural substrate of prediction and reward. Science, 275(5306), 1593–1599.
Article Google Scholar
Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6, 461–464.
Article Google Scholar
Shao, J. (1993). Linear model selection by cross-validation. Journal of the American Statistical Association, 88, 486–494.
Article Google Scholar
Shiffrin, R. M., Borner, K., Stigler, S.M. (2018). Scientific progress despite irreproducibility: a seeming paradox. Proceedings of the National Academy of Sciences, USA, 115, 2632–2639.
Article Google Scholar
Tenenbaum, J.B., & Griffiths, T.L. (2001). Generalization, similarity, and Bayesian inference. Behavioral and Brain Sciences, 24, 629–640.
PubMed Google Scholar
Vehtari, A., Simpson, D., Yao, Y., Gelman, A. (2018). Limitations of Bayesian leave-one-out cross-validation. Computational Brain and Behavior.
Vehtari, A., & Ojanen, J. (2012). A survey of Bayesian predictive methods for model assessment, selection and comparison. Statistics Surveys, 6, 142–228.
Article Google Scholar
Voorspoels, W., Navarro, D.J., Perfors, A., Ransom, K., Storms, G. (2015). How do people learn from negative evidence? Non-monotonic generalizations and sampling assumptions in inductive reasoning. Cognitive Psychology, 81, 1–25.
Article Google Scholar
Wagenmakers, E. J. (2007). A practical solution to the pervasive problems of p values. Psychonomic Bulletin & Review, 14(5), 779–804.
Article Google Scholar
Wickelgren, W.A. (1972). Trace resistance and decay of long-term memory. Journal of Mathematical Psychology, 9, 418–455.
Article Google Scholar

Download references

Acknowledgements

I am grateful to many people for helpful conversations and comments that shaped this paper, most notably Nancy Briggs, Berna Devezer, Chris Donkin, Olivia Guest, Daniel Simpson, Iris van Rooij and Fred Westbrook.

Author information

Authors and Affiliations

School of Psychology, University of New South Wales, Kensington, Sydney, NSW, 2052, Australia
Danielle J. Navarro

Authors

Danielle J. Navarro
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Danielle J. Navarro.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Navarro, D.J. Between the Devil and the Deep Blue Sea: Tensions Between Scientific Judgement and Statistical Model Selection. Comput Brain Behav 2, 28–34 (2019). https://doi.org/10.1007/s42113-018-0019-z

Download citation

Published: 27 November 2018
Issue Date: March 2019
DOI: https://doi.org/10.1007/s42113-018-0019-z

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Between the Devil and the Deep Blue Sea: Tensions Between Scientific Judgement and Statistical Model Selection

Abstract

Access this article

Similar content being viewed by others

Rejoinder: More Limitations of Bayesian Leave-One-Out Cross-Validation

Limitations of “Limitations of Bayesian Leave-one-out Cross-Validation for Model Selection”

Why the Best Predictive Models Are Often Different from the Best Explanatory Models: A Theoretical Explanation

Notes

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher’s Note

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Between the Devil and the Deep Blue Sea: Tensions Between Scientific Judgement and Statistical Model Selection

Abstract

Access this article

Similar content being viewed by others

Rejoinder: More Limitations of Bayesian Leave-One-Out Cross-Validation

Limitations of “Limitations of Bayesian Leave-one-out Cross-Validation for Model Selection”

Why the Best Predictive Models Are Often Different from the Best Explanatory Models: A Theoretical Explanation

Notes

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher’s Note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation