Sequential sampling models without random between-trial variability: the racing diffusion model of speeded decision making


Most current sequential sampling models have random between-trial variability in their parameters. These sources of variability make the models more complex in order to fit response time data, do not provide any further explanation of how the data were generated, and have recently been criticised for allowing infinite flexibility in the models. To explore and test the need for between-trial variability parameters, we develop a simple sequential sampling model of N-choice speeded decision making: the racing diffusion model. The model makes speeded decisions from a race of evidence accumulators that integrate information in a noisy fashion within a trial. The racing diffusion model does not assume that any evidence accumulation process varies between trials, and so the model provides alternative explanations of key response time phenomena, such as fast and slow error response times relative to correct response times. Overall, our paper gives good reason to rethink including between-trial variability parameters in sequential sampling models.



  1. The distribution is defective because it is normalized to the probability of its associated response.

  2. The DDM also contains within-trial drift rate variability, but because evidence for one response counts against evidence for the other response, only having within-trial noise leads to the model predicting equally fast correct and error response time distributions.

  3. Note that the original authors found an increase of 19 ms rather than 10 ms in non-decision time between placebo and high alcohol doses.


  1. Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19(6), 716–723.

  2. Amano, K., Goda, N., Nishida, S., Ejima, Y., Takeda, T., & Ohtani, Y. (2006). Estimation of the timing of human visual perception from magnetoencephalography. The Journal of Neuroscience, 26(15), 3981–3991.

  3. Anders, R., Alario, F., & van Maanen, L. (2016). The shifted Wald distribution for response time data analysis. Psychological Methods, 21(3), 309.

  4. Boehm, U., Annis, J., Frank, M. J., Hawkins, G. E., Heathcote, A., Kellen, D., et al. (2018). Estimating across-trial variability parameters of the diffusion decision model: Expert advice and recommendations. Journal of Mathematical Psychology, 87, 46–75.

  5. Bompas, A., & Sumner, P. (2011). Saccadic inhibition reveals the timing of automatic and voluntary signals in the human brain. The Journal of Neuroscience, 31(35), 12501–12512.

  6. Brown, S., & Heathcote, A. (2005). A ballistic model of choice response time. Psychological Review, 112(1), 117.

  7. Brown, S. D., & Heathcote, A. (2008). The simplest complete model of choice response time: Linear ballistic accumulation. Cognitive Psychology, 57(3), 153–178.

  8. Brown, S. D., Marley, A., Donkin, C., & Heathcote, A. (2008). An integrated model of choices and response times in absolute identification. Psychological Review, 115(2), 396.

  9. Cook, E. P., & Maunsell, J. H. (2002). Dynamics of neuronal responses in macaque MT and VIP during motion detection. Nature Neuroscience, 5(10), 985–994.

  10. Ditterich, J. (2006a). Evidence for time-variant decision making. European Journal of Neuroscience, 24(12), 3628–3641.

  11. Ditterich, J. (2006b). Stochastic models of decisions about motion direction: Behavior and physiology. Neural Networks, 19(8), 981–1012.

  12. Donkin, C., & Brown, S. D. (2018). Response times and decision making. In T. Wixted, & E. J. Wagenmakers (Eds.) The Stevens’ handbook of experimental psychology and cognitive neuroscience. (4th edn.), Vol. 5. New York: Wiley.

  13. Donkin, C., Brown, S. D., Heathcote, A., & Wagenmakers, E.-J. (2011). Diffusion versus linear ballistic accumulation: Different models but the same conclusions about psychological processes? Psychonomic Bulletin & Review, 18(1), 61–69.

  14. Donkin, C., Heathcote, A., & Brown, S. (2009a). Is the linear ballistic accumulator model really the simplest model of choice response times: A Bayesian model complexity analysis. In 9th international conference on cognitive modeling—ICCM2009. Manchester, UK.

  15. Donkin, C., Heathcote, A., Brown, S., & Andrews, S. (2009b). Non-decision time effects in the lexical decision task. In Proceedings of the 31st annual conference of the cognitive science society. Austin: Cognitive Science Society.

  16. Donsker, M. D. (1951). An invariance principle for certain probability limit theorems. Memoirs of the American Mathematical Society.

  17. Egan, J. P. (1958). Recognition memory and the operating characteristic. USAF Operational Applications Laboratory Technical Note.

  18. Evans, N. J., Tillman, G., & Wagenmakers, E.-J. (in press). Systematic and random sources of variability in perceptual decision-making: Comment on Ratcliff, Voskuilen, and Mckoon (2018). Psychological Review.

  19. Fecteau, J. H., & Munoz, D. P. (2003). Exploring the consequences of the previous trial. Nature Reviews Neuroscience, 4(6), 435.

  20. Geisser, S., & Eddy, W. F. (1979). A predictive approach to model selection. Journal of the American Statistical Association, 74(365), 153–160.

  21. Gelman, A., & Rubin, D. B. (1992). Inference from iterative simulation using multiple sequences. Statistical Science, 7, 457–511.

  22. Gomez, P., Ratcliff, R., & Childers, R. (2015). Pointing, looking at, and pressing keys: A diffusion model account of response modality. Journal of Experimental Psychology: Human Perception and Performance, 41(6), 1515–1523.

  23. Hanes, D. P., & Schall, J. D. (1996). Neural control of voluntary movement initiation. Science, 274(5286), 427.

  24. Hawkins, G. E., Brown, S. D., Steyvers, M., & Wagenmakers, E.-J. (2012). An optimal adjustment procedure to minimize experiment time in decisions with multiple alternatives. Psychonomic Bulletin & Review, 19(2), 339–348.

  25. Heathcote, A. (2004). Fitting Wald and ex-Wald distributions to response time data: An example using functions for the S-PLUS package. Behavior Research Methods, 36, 678–694.

  26. Heathcote, A., & Hayes, B. (2012). Diffusion versus linear ballistic accumulation: Different models for response time with different conclusions about psychological mechanisms? Canadian Journal of Experimental Psychology/Revue canadienne de psychologie expérimentale, 66(2), 125–36.

  27. Heathcote, A., Wagenmakers, E. J., & Brown, S. D. (2014). The falsifiability of actual decision-making models.

  28. Hick, W. E. (1952). On the rate of gain of information. Quarterly Journal of Experimental Psychology, 4(1), 11–26.

  29. Hyman, R. (1953). Stimulus information as a determinant of reaction time. Journal of Experimental Psychology, 45(3), 188.

  30. Jones, M., Curran, T., Mozer, M. C., & Wilder, M. H. (2013). Sequential effects in response time reveal learning mechanisms and event representations. Psychological Review, 120(3), 628.

  31. Jones, M., & Dzhafarov, E. N. (2014). Unfalsifiability and mutual translatability of major modeling schemes for choice reaction time. Psychological Review, 121(1), 1.

  32. Laming, D. R. J. (1968). Information theory of choice-reaction times. London: Academic Press.

  33. Leite, F. P., & Ratcliff, R. (2010). Modeling reaction time and accuracy of multiple-alternative decisions. Attention, Perception, & Psychophysics, 72(1), 246–273.

  34. Lerche, V., & Voss, A. (2016). Model complexity in diffusion modeling: Benefits of making the model more parsimonious. Frontiers in Psychology, 7.

  35. Link, S. W., & Heath, R. A. (1975). A sequential theory of psychological discrimination. Psychometrika, 40, 77–105.

  36. Logan, G. D. (1988). Toward an instance theory of automatization. Psychological Review, 95(4), 492.

  37. Logan, G. D., Van Zandt, T., Verbruggen, F., & Wagenmakers, E.-J. (2014). On the ability to inhibit thought and action: General and special theories of an act of control. Psychological Review, 121(1), 66–95.

  38. Matzke, D., & Wagenmakers, E.-J. (2009). Psychological interpretation of the ex-Gaussian and shifted Wald parameters: A diffusion model analysis. Psychonomic Bulletin & Review, 16(5), 798–817.

  39. Osth, A. F., Dennis, S., & Heathcote, A. (in press). Likelihood ratio sequential sampling models of recognition memory. Cognitive Psychology.

  40. Osth, A. F., & Farrell, S. (2019). Using response time distributions and race models to characterize primacy and recency effects in free recall initiation. Psychological Review, 126(4), 578.

  41. Purcell, B. A., Heitz, R. P., Cohen, J. Y., Schall, J. D., Logan, G. D., & Palmeri, T. J. (2010). Neurally constrained modeling of perceptual decision making. Psychological Review, 117(4), 1113–1143.

  42. Purcell, B. A., Schall, J. D., Logan, G. D., & Palmeri, T. J. (2012). From salience to saccades: Multiple-alternative gated stochastic accumulator model of visual search. The Journal of Neuroscience, 32(10), 3433–3446.

  43. Raab, D. H. (1962). Division of psychology: Statistical facilitation of simple reaction times. Transactions of the New York Academy of Sciences, 24(5 Series II), 574–590.

  44. Rae, B., Heathcote, A., Donkin, C., Averell, L., & Brown, S. (2014). The hare and the tortoise: Emphasizing speed can change the evidence used to make decisions. Journal of Experimental Psychology: Learning, Memory and Cognition, 40(5), 1226–43.

  45. Ratcliff, R. (1978). A theory of memory retrieval. Psychological Review, 85(2), 59–108.

  46. Ratcliff, R. (2002). A diffusion model account of response time and accuracy in a brightness discrimination task: Fitting real data and failing to fit fake but plausible data. Psychonomic Bulletin & Review, 9(2), 278–291.

  47. Ratcliff, R. (2013). Parameter variability and distributional assumptions in the diffusion model. Psychological Review, 120(1), 281–292.

  48. Ratcliff, R. (2015). Modeling one-choice and two-choice driving tasks. Attention, Perception, & Psychophysics, 77(6), 2134–2144.

  49. Ratcliff, R., Gomez, P., & McKoon, G. M. (2004). A diffusion model account of the lexical decision task. Psychological Review, 111, 159–182.

  50. Ratcliff, R., Philiastides, M. G., & Sajda, P. (2009). Quality of evidence for perceptual decision making is indexed by trial-to-trial variability of the EEG. Proceedings of the National Academy of Sciences, 106(16), 6539–6544.

  51. Ratcliff, R., & Rouder, J. N. (1998). Modeling response times for two-choice decisions. Psychological Science, 9(5), 347–356.

  52. Ratcliff, R., & Rouder, J. N. (2000). A diffusion model account of masking in two-choice letter identification. Journal of Experimental Psychology: Human Perception and Performance, 26(1), 127.

  53. Ratcliff, R., Sederberg, P. B., Smith, T. A., & Childers, R. (2016). A single trial analysis of EEG in recognition memory: Tracking the neural correlates of memory strength. Neuropsychologia, 93, 128–141.

  54. Ratcliff, R., & Smith, P. L. (2004). A comparison of sequential sampling models for two-choice reaction time. Psychological Review, 111(2), 333–67.

  55. Ratcliff, R., & Starns, J. J. (2009). Modeling confidence and response time in recognition memory. Psychological Review, 116(1), 59–83.

  56. Ratcliff, R., & Starns, J. J. (2013). Modeling confidence judgments, response times, and multiple choices in decision making: Recognition memory and motion discrimination. Psychological Review, 120(3), 697.

  57. Ratcliff, R., & Strayer, D. (2014). Modeling simple driving tasks with a one-boundary diffusion model. Psychonomic Bulletin & Review, 21(3), 577–589.

  58. Ratcliff, R., & Tuerlinckx, F. (2002). Estimating parameters of the diffusion model: Approaches to dealing with contaminant reaction times and parameter variability. Psychonomic Bulletin & Review, 9, 438–481.

  59. Ratcliff, R., & Van Dongen, H. P. (2011). Diffusion model for one-choice reaction-time tasks and the cognitive effects of sleep deprivation. Proceedings of the National Academy of Sciences, 108(27), 11285–11290.

  60. Ratcliff, R., Van Zandt, T., & McKoon, G. (1999). Connectionist and diffusion models of reaction time. Psychological Review, 106(2), 261.

  61. Ratcliff, R., Voskuilen, C., & McKoon, G. (2018). Internal and external sources of variability in perceptual decision-making. Psychological Review.

  62. Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6(2), 461–464.

  63. Schwarz, W. (2001). The ex-Wald distribution as a descriptive model of response times. Behavior Research Methods, 33(4), 457–469.

  64. Smith, P. L. (1995). Psychophysically principled models of visual simple reaction time. Psychological Review, 102(3), 567–593.

  65. Smith, P. L., Ratcliff, R., & McKoon, G. (2014). The diffusion model is not a deterministic growth model: Comment on Jones and Dzhafarov (2014). Psychological Review, 121(4), 679–688.

  66. Smith, P. L., & Vickers, D. (1988). The accumulator model of two-choice discrimination. Journal of Mathematical Psychology, 32(2), 135–168.

  67. Starns, J. J. (2014). Using response time modeling to distinguish memory and decision processes in recognition and source tasks. Memory & Cognition, 42(8), 1357–1372.

  68. Starns, J. J., & Ratcliff, R. (2014). Validating the unequal-variance assumption in recognition memory using response time distributions instead of ROC functions: A diffusion model analysis. Journal of Memory and Language, 70, 36–52.

  69. Tandonnet, C., Burle, B., Hasbroucq, T., & Vidal, F. (2005). Spatial enhancement of EEG traces by surface Laplacian estimation: Comparison between local and global methods. Clinical Neurophysiology, 116(1), 18–24.

  70. Teller, D. Y. (1984). Linking propositions. Vision Research, 24(10), 1233–1246.

  71. Teodorescu, A. R., & Usher, M. (2013). Disentangling decision models: From independence to competition. Psychological Review, 120(1), 1–38.

  72. Ter Braak, C. J. (2006). A Markov chain Monte Carlo version of the genetic algorithm differential evolution: Easy Bayesian computing for real parameter spaces. Statistics and Computing, 16(3), 239–249.

  73. Tillman, G., Osth, A., van Ravenzwaaij, D., & Heathcote, A. (2017). A diffusion decision model analysis of evidence variability in the lexical decision task. Psychonomic Bulletin & Review, 24(6), 1949–1956.

  74. Tillman, G., Strayer, D., Eidels, A., & Heathcote, A. (2017). Modeling cognitive load effects of conversation between a passenger and driver. Attention, Perception, & Psychophysics, 79(6), 1795–1803.

  75. Townsend, J. T., & Ashby, F. G. (1983). Stochastic modeling of elementary psychological processes. CUP Archive.

  76. Turner, B. M. (2019). Toward a common representational framework for adaptation. Psychological Review, 126(5), 660.

  77. Turner, B. M., Gao, J., Koenig, S., Palfy, D., & McClelland, J. L. (2017). The dynamics of multimodal integration: The averaging diffusion model. Psychonomic Bulletin & Review, 24(6), 1819–1843.

  78. Turner, B. M., Sederberg, P. B., Brown, S. D., & Steyvers, M. (2013). A method for efficiently sampling from distributions with correlated dimensions. Psychological Methods, 18(3), 368–84.

  79. Turner, B. M., van Maanen, L., & Forstmann, B. U. (2015). Informing cognitive abstractions through neuroimaging: The neural drift diffusion model. Psychological Review, 122(2), 312–336.

  80. Usher, M., & McClelland, J. L. (2001). The time course of perceptual choice: The leaky competing accumulator model. Psychological Review, 108, 550–592.

  81. Usher, M., Olami, Z., & McClelland, J. L. (2002). Hick’s law in a stochastic race model with speed–accuracy tradeoff. Journal of Mathematical Psychology, 46(6), 704–715.

  82. Van Maanen, L., Grasman, R. P., Forstmann, B. U., Keuken, M. C., Brown, S. D., & Wagenmakers, E.-J. (2012). Similarity and number of alternatives in the random-dot motion paradigm. Attention, Perception, & Psychophysics, 74(4), 739–753.

  83. van Ravenzwaaij, D., Donkin, C., & Vandekerckhove, J. (2017). The EZ diffusion model provides a powerful test of simple empirical effects. Psychonomic Bulletin & Review, 24(2), 547–556.

  84. van Ravenzwaaij, D., Dutilh, G., & Wagenmakers, E.-J. (2012). A diffusion model decomposition of the effects of alcohol on perceptual decision making. Psychopharmacology, 219(4), 1017–1025.

  85. Verdonck, S., & Tuerlinckx, F. (2015). Factoring out non-decision time in choice RT data: Theory and implications. Psychological Review, 123(2), 208–218.

  86. Vidal, F., Burle, B., Grapperon, J., & Hasbroucq, T. (2011). An ERP study of cognitive architecture and the insertion of mental processes: Donders revisited. Psychophysiology, 48(9), 1242–1251.

  87. Wald, A. (1947). Sequential analysis. New York: Wiley.

  88. Watanabe, S. (2010). Asymptotic equivalence of Bayes cross validation and widely applicable information criterion in singular learning theory. The Journal of Machine Learning Research, 11, 3571–3594.

  89. Woodman, G. F., Kang, M.-S., Thompson, K., & Schall, J. D. (2008). The effect of visual search efficiency on response preparation neurophysiological evidence for discrete flow. Psychological Science, 19(2), 128–136.


Author information



Corresponding author

Correspondence to Gabriel Tillman.

Additional information

Author Note

This research was supported by National Eye Institute grant no R01 EY021833 and the Vanderbilt Vision Research Center (NEI P30-EY008126).

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Appendix A: Cumulative density function

The cumulative density function for the RDM was derived for Logan et al. (2014), but only the R code for the function was available. For completeness, we present the equation here.

The CDF for the RDM is that of a Wald distribution with variability in start point. The PDF for the Wald distribution is

$$ f(t ~|~ b,v) = \frac{b}{\sqrt{2 \pi t^{3}}} \exp \left\{ - \frac{1}{2} \frac{(v t - b)^{2}}{t} \right\}, $$

where b and v are the threshold and drift rate, respectively, of a diffusion process with a single absorbing boundary (b). This PDF can be integrated over t to give the corresponding CDF

$$ F(t~|~b,v) = {{\Phi}} \left( \frac{vt - b}{\sqrt{t}} \right) + \exp \left( 2 vb \right) {{\Phi}} \left( \frac{-vt-b}{\sqrt{t}} \right). $$
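As a quick sanity check on the two expressions above, the following sketch evaluates the Wald PDF and CDF with the Python standard library and compares the closed-form CDF against a numerical integral of the PDF. The function names, the midpoint-rule helper, and the parameter values are illustrative assumptions and are not taken from the paper or its R code.

```python
import math
from statistics import NormalDist

_PHI = NormalDist().cdf  # standard normal CDF


def wald_pdf(t, b, v):
    """First-passage-time density for threshold b and drift v (unit diffusion coefficient)."""
    if t <= 0:
        return 0.0
    return b / math.sqrt(2 * math.pi * t ** 3) * math.exp(-0.5 * (v * t - b) ** 2 / t)


def wald_cdf(t, b, v):
    """Closed-form CDF corresponding to wald_pdf."""
    if t <= 0:
        return 0.0
    rt = math.sqrt(t)
    return _PHI((v * t - b) / rt) + math.exp(2 * v * b) * _PHI((-v * t - b) / rt)


def integrate_pdf(t, b, v, steps=20000):
    """Midpoint-rule integral of the PDF from 0 to t, used only as a numerical check."""
    h = t / steps
    return h * sum(wald_pdf((k + 0.5) * h, b, v) for k in range(steps))


# Illustrative values only; they are not estimates reported in the paper.
print(wald_cdf(0.8, 1.0, 2.0), integrate_pdf(0.8, 1.0, 2.0))
```

The two printed numbers should agree to several decimal places. For large values of vb the factor exp(2vb) can overflow, so a production implementation would likely work on the log scale.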

The starting point z < b of the process is implicitly equal to 0 in these expressions. A non-zero start point z with threshold b is equivalent to a process with a zero start point and threshold b − z, so we can write

$$ \begin{array}{@{}rcl@{}} F(t~|~b,v,z) &=& F(t~|~b-z,v) \\ &=& {{\Phi}} \left( \frac{vt - (b-z)}{\sqrt{t}} \right)+\exp \left( 2 v(b-z) \right)\\ && \times {{\Phi}} \left( \frac{-vt-(b-z)}{\sqrt{t}} \right). \end{array} $$

Our goal is to compute the CDF for the mixture over the starting point z, where z follows a Uniform[0, A] distribution. That is, we desire an expression for

$$ F(t~|~b,v,A) = \frac{1}{A} {{\int}_{0}^{A}} F(t~|~b,v,z) dz. $$

We begin by noting first that the CDF Φ of the standard normal distribution can be written as a transformation of the error function:

$$ {{\Phi}}(x) = \frac{1}{2} \text{erf} \left( \frac{x}{\sqrt{2}} \right) + \frac{1}{2}. $$

Therefore we can rewrite Eq. 11 as

$$ \begin{array}{@{}rcl@{}} F(t~|~b,v,z) \!\!&=&\!\!\frac{1}{2} \left[1 + \exp(2v(b-z)) + \text{erf} \left( \frac{vt - (b-z)}{\sqrt{2t}} \right) \right. \\ &&\!\!+\left. \exp \left( 2 v(b - z) \right) \text{erf} \left( \frac{-vt-(b-z)}{\sqrt{2t}} \right) \right]. \end{array} $$

Because the integration to be performed has a number of steps, for clarity we write

$$ F(t~|~b,v,z) = \frac{1}{2} \left[ \alpha(z) + \beta(z) + \gamma(z) + \delta(z) \right], $$

where

$$ \begin{array}{@{}rcl@{}} \alpha(z) &=& 1, \\ \beta(z) &=& \exp(2v(b-z)), \\ \gamma(z) &=& \text{erf} \left( \frac{vt - (b-z)}{\sqrt{2t}} \right), \text{ and} \\ \delta(z) &=& \exp(2v(b-z)) \, \text{erf} \left( \frac{-vt - (b-z)}{\sqrt{2t}} \right), \end{array} $$

and we will integrate each term over z and then add the results to obtain F(t | b,v,A).

α(z) and β(z)

The first two terms are trivial:

$$ \frac{1}{A} {{\int}_{0}^{A}} \alpha(z) dz = 1, $$

and

$$ \frac{1}{A} {{\int}_{0}^{A}} \beta(z) dz = \frac{\exp(2vb)}{2vA} \left( 1 - \exp(-2vA) \right). $$


γ(z)

We solve the integral of γ(z) by noting first that

$$ {{\int}_{0}^{x}} \text{erf}(u) du = x \, \text{erf}(x) + \frac{1}{\sqrt{\pi}} \left( \exp(-x^{2}) - 1 \right). $$

Through the change of variable

$$ u = \frac{vt - (b-z)}{\sqrt{2t}}, $$

we see that

$$ \begin{array}{@{}rcl@{}} \frac{1}{A} {{\int}_{0}^{A}} \gamma(z) dz & = & \frac{1}{A} {{\int}_{0}^{A}} \text{erf} \left( \frac{vt - (b-z)}{\sqrt{2t}} \right) dz \\ & = & \frac{\sqrt{2t}}{A} {\int}_{a_{1}}^{a_{2}} \text{erf}(u) du, \end{array} $$

where

$$ a_{1} = \frac{vt-b}{\sqrt{2t}} \text{ and } a_{2} = \frac{vt-(b-A)}{\sqrt{2t}}. $$

Equation 14 is equal to

$$ \frac{\sqrt{2t}}{A} \left[ {\int}_{0}^{a_{2}} \text{erf}(u) du - {\int}_{0}^{a_{1}} \text{erf}(u) du \right], $$

and the signs of \(a_{1}\) and \(a_{2}\) are irrelevant given the symmetry of the function erf(x). Because

$$ {\int}_{0}^{a_{i}} \text{erf}(u) du = a_{i} \, \text{erf}(a_{i}) + \frac{1}{\sqrt{\pi}} \left( \exp(-{a_{i}^{2}}) - 1 \right), $$

substitution into Eq. 15 yields

$$ \begin{array}{@{}rcl@{}} \frac{1}{A} {{\int}_{0}^{A}} \gamma(z) dz &=& \frac{\sqrt{2t}}{A}\left[a_{2} \, \text{erf}(a_{2})+ \frac{1}{\sqrt{\pi}} \left( \exp(-{a_{2}^{2}}) - 1 \right) \right.\\ &&\left.- a_{1} \, \text{erf}(a_{1}) - \frac{1}{\sqrt{\pi}} \left( \exp(-{a_{1}^{2}}) - 1 \right) \right]. \end{array} $$

Letting \(\alpha _{i} = \sqrt {2} a_{i}\) and substituting back the transformation of the error function in Eq. 12 gives

$$ \begin{array}{@{}rcl@{}} \frac{1}{A} {{\int}_{0}^{A}} \gamma(z) dz &=& -1+ \frac{2\sqrt{t}}{A} \left\{ \left[ \alpha_{2}{\Phi} \left( \alpha_{2} \right) - \alpha_{1} {\Phi} \left( \alpha_{1} \right) \right] \right.\\ &&\left.+ \left[ \phi \left( \alpha_{2} \right) - \phi \left( \alpha_{1} \right) \right]\right\}, \end{array} $$

where ϕ(x) is the standard normal PDF.


δ(z)

The function δ(z) must be integrated by parts. We first apply a change of variable as was used to integrate γ(z), that is,

$$ x = \frac{-vt - (b-z)}{\sqrt{2t}}. $$

Then

$$ \begin{array}{@{}rcl@{}} \frac{1}{A} {{\int}_{0}^{A}} \delta(z) dz & = & \frac{1}{A} {{\int}_{0}^{A}} \exp(2v(b-z))\\ && \times \, \text{erf} \left( \frac{-vt - (b-z)}{\sqrt{2t}} \right) dz \\ & = & \frac{\sqrt{2t}}{A} {\int}_{b_{1}}^{b_{2}} \exp\left( -2v (x \sqrt{2t} + vt) \right) \text{erf}(x) dx \\ & = & \frac{\sqrt{2t}}{A} \exp \left( -2t v^{2} \right) \\ &&\times {\int}_{b_{1}}^{b_{2}} \exp\left( -2v \sqrt{2t}x\right) \text{erf}(x) dx, \end{array} $$

where

$$ b_{1} = \frac{-vt -b}{\sqrt{2t}} \text{ and } b_{2} = \frac{-vt-(b-A)}{\sqrt{2t}}. $$

Letting

$$ u = \text{erf}(x) \text{ and } v^{\prime} = \exp\left( -2v \sqrt{2t} x\right) $$

and noting that

$$ u^{\prime} = \frac{2}{\sqrt{\pi}} \exp \left( -x^{2} \right) \text{ and } v = - \frac{1}{2v \sqrt{2t}} \exp(-2v \sqrt{2t} x), $$

integrating by parts gives

$$ \begin{array}{@{}rcl@{}} &&\frac{1}{A} {{\int}_{0}^{A}} \delta(z) dz\\ & = & \frac{\sqrt{2t}}{A} \exp \left( -2t v^{2} \right) {\int}_{b_{1}}^{b_{2}} \exp \left( -2v \sqrt{2t} x \right) \text{erf}(x) dx \\ & = & \frac{\sqrt{2t}}{A}\exp \left( -2t v^{2} \right) \left[\left( - \frac{1}{2v \sqrt{2t}} \exp(-2v \sqrt{2t} b_{2}) \, \text{erf}(b_{2}) \right.\right.\\ & & \left.\left.+ \frac{1}{2v \sqrt{2t}} \exp(-2v \sqrt{2t} b_{1}) \, \text{erf}(b_{1}) \right) \right.\\ & & \left. + {\int}_{b_{1}}^{b_{2}} \frac{1}{v\sqrt{2t\pi}} \exp \left( -x^{2} -2v \sqrt{2t} x \right) dx \right] \\ & = & \frac{\sqrt{2t}}{A}\exp \left( -2t v^{2} \right) \left[\left( - \frac{1}{2v \sqrt{2t}} \exp(-2v \sqrt{2t} b_{2}) \, \text{erf}(b_{2}) \right.\right.\\ & & \left.\left.+ \frac{1}{2v \sqrt{2t}} \exp(-2v \sqrt{2t} b_{1}) \, \text{erf}(b_{1}) \right) \right.\\ & & \left. + \frac{1}{v \sqrt{t}} \exp(2tv^{2}) {\int}_{b_{1}}^{b_{2}} \frac{1}{\sqrt{2\pi}} \exp \left( - (x + v \sqrt{2t})^{2} \right) dx \right] \\ & = & \frac{\exp \left( -2t v^{2} \right)}{2vA} \left[\left( - \exp(-2v \sqrt{2t} b_{2}) \, \text{erf}(b_{2}) \right.\right.\\ & & \left.\left.+ \exp(-2v \sqrt{2t} b_{1}) \, \text{erf}(b_{1}) \right) \right.\\ & & \left. + \exp(2tv^{2}) \left( \text{erf}(b_{2} + v \sqrt{2t}) - \text{erf}(b_{1} + v\sqrt{2t}) \right) \right]. \end{array} $$

Transforming erf(x) back to Φ(x) and substituting \(\beta _{i} = \sqrt {2} b_{i}\), and recalling that \(\alpha _{i} = \sqrt {2} a_{i}\), we obtain

$$ \begin{array}{@{}rcl@{}} \lefteqn{\frac{1}{A} {{\int}_{0}^{A}} \delta(z) dz} \\ & = & \frac{1}{v A} \left\{\left[ {\Phi}\left( \alpha_{2} \right) - {\Phi}\left( \alpha_{1} \right) \right] - \left[ \exp\left( 2v (b-A) \right) {\Phi}\left( \beta_{2} \right) \right.\right.\\ &&\left.\left.- \exp\left( 2vb \right) {\Phi}\left( \beta_{1} \right) \right] \right. \\ & & \left. + \frac{1}{2} \left[\exp\left( 2v(b-A)\right) - \exp\left( 2vb\right)\right] \right\}. \end{array} $$
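The closed-form averages of β(z), γ(z), and δ(z) can be checked numerically. The sketch below compares each closed form with a midpoint-rule average over z in [0, A], writing the integrands explicitly in terms of the error function. The helper names and parameter values are illustrative assumptions, not part of the original derivation.

```python
import math
from statistics import NormalDist

Phi = NormalDist().cdf  # standard normal CDF
phi = NormalDist().pdf  # standard normal PDF


def midpoint_mean(f, A, steps=20000):
    """Approximate (1/A) * integral_0^A f(z) dz with the midpoint rule."""
    h = A / steps
    return sum(f((k + 0.5) * h) for k in range(steps)) * h / A


def closed_forms(t, b, v, A):
    """Closed-form averages of beta(z), gamma(z), delta(z) for z ~ Uniform[0, A]."""
    rt = math.sqrt(t)
    al1, al2 = (v * t - b) / rt, (v * t - (b - A)) / rt    # alpha_i = sqrt(2) * a_i
    be1, be2 = (-v * t - b) / rt, (-v * t - (b - A)) / rt  # beta_i = sqrt(2) * b_i
    e_lo, e_hi = math.exp(2 * v * (b - A)), math.exp(2 * v * b)
    beta = e_hi / (2 * v * A) * (1 - math.exp(-2 * v * A))
    gamma = -1 + 2 * rt / A * ((al2 * Phi(al2) - al1 * Phi(al1)) + (phi(al2) - phi(al1)))
    delta = ((Phi(al2) - Phi(al1))
             - (e_lo * Phi(be2) - e_hi * Phi(be1))
             + 0.5 * (e_lo - e_hi)) / (v * A)
    return beta, gamma, delta


def numeric_forms(t, b, v, A):
    """The same three averages, obtained by integrating the integrands directly."""
    s2t = math.sqrt(2 * t)
    beta = midpoint_mean(lambda z: math.exp(2 * v * (b - z)), A)
    gamma = midpoint_mean(lambda z: math.erf((v * t - (b - z)) / s2t), A)
    delta = midpoint_mean(
        lambda z: math.exp(2 * v * (b - z)) * math.erf((-v * t - (b - z)) / s2t), A)
    return beta, gamma, delta


# Illustrative values with b > A; not estimates from the paper.
print(closed_forms(0.6, 1.2, 1.5, 0.5))
print(numeric_forms(0.6, 1.2, 1.5, 0.5))
```

The two printed triples should match closely, which gives some reassurance that the signs and factors in each closed form are internally consistent.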

The solution

The CDF of the Wald distribution with uniform variability in start point is given by

$$ F(t~|~b,v,A) = \frac{1}{2A} {{\int}_{0}^{A}} \left( \alpha(z) + \beta(z) + \gamma(z) + \delta(z) \right) dz. $$

Adding the four integrals computed in Sections “α(z) and β(z)” through “δ(z)” gives

$$ \begin{array}{@{}rcl@{}} &&\frac{1}{A} {{\int}_{0}^{A}} \left( \alpha(z) + \beta(z) + \gamma(z) + \delta(z) \right) dz \\ & =& \frac{1}{vA} \left( {\Phi}(\alpha_{2}) - {\Phi}(\alpha_{1}) \right) + \frac{2 \sqrt{t}}{A} \left( \alpha_{2} {\Phi}(\alpha_{2}) - \alpha_{1} {\Phi}(\alpha_{1}) \right) \\ && -\frac{1}{vA} \left[\exp(2v(b-A)) {\Phi}(\beta_{2}) - \exp(2vb) {\Phi}(\beta_{1}) \right]\\ &&+ \frac{2 \sqrt{t}}{A} \left( \phi(\alpha_{2}) -\phi(\alpha_{1}) \right) . \end{array} $$

Multiplying by 1/2 therefore gives


$$ \begin{array}{@{}rcl@{}} &&F(t~|~b,v,A)\\ & = & \frac{1}{2vA} \left( {\Phi}(\alpha_{2}) - {\Phi}(\alpha_{1}) \right) + \frac{ \sqrt{t}}{A} \left( \alpha_{2} {\Phi}(\alpha_{2}) - \alpha_{1} {\Phi}(\alpha_{1}) \right) \\ & & -\frac{1}{2vA} \left[\exp(2v(b-A)) {\Phi}(\beta_{2}) - \exp(2vb) {\Phi}(\beta_{1}) \right] \\ &&+ \frac{\sqrt{t}}{A} \left( \phi(\alpha_{2}) -\phi(\alpha_{1}) \right). \end{array} $$
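The final expression translates directly into code. The following is a minimal sketch using only the Python standard library, not the authors' R implementation. The function names are hypothetical, the threshold is assumed to exceed the largest start point (b > A), the drift is assumed to be non-zero, and the diffusion coefficient is assumed to be 1. The closed form is checked against a direct numerical average of the zero-start-point Wald CDF over z.

```python
import math
from statistics import NormalDist

Phi = NormalDist().cdf  # standard normal CDF
phi = NormalDist().pdf  # standard normal PDF


def wald_cdf(t, b, v):
    """Wald CDF for threshold b and drift v with a zero start point."""
    rt = math.sqrt(t)
    return Phi((v * t - b) / rt) + math.exp(2 * v * b) * Phi((-v * t - b) / rt)


def rdm_cdf(t, b, v, A):
    """Wald CDF with Uniform[0, A] start-point variability (closed form)."""
    if t <= 0:
        return 0.0
    rt = math.sqrt(t)
    al1, al2 = (v * t - b) / rt, (v * t - (b - A)) / rt
    be1, be2 = (-v * t - b) / rt, (-v * t - (b - A)) / rt
    term1 = (Phi(al2) - Phi(al1)) / (2 * v * A)
    term2 = rt / A * (al2 * Phi(al2) - al1 * Phi(al1))
    term3 = -(math.exp(2 * v * (b - A)) * Phi(be2)
              - math.exp(2 * v * b) * Phi(be1)) / (2 * v * A)
    term4 = rt / A * (phi(al2) - phi(al1))
    return term1 + term2 + term3 + term4


def rdm_cdf_numeric(t, b, v, A, steps=20000):
    """Midpoint-rule average of F(t | b - z, v) over z in [0, A], for checking."""
    h = A / steps
    return sum(wald_cdf(t, b - (k + 0.5) * h, v) for k in range(steps)) * h / A


# Illustrative values only; they are not estimates from the paper.
print(rdm_cdf(0.7, 1.5, 2.0, 1.0), rdm_cdf_numeric(0.7, 1.5, 2.0, 1.0))
```

As with the plain Wald CDF, the exp(2vb) factors can overflow for large vb, and the formula divides by v, so small drift rates need separate handling in practice.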

Appendix B: Model structure and fitting method

Each parameter for each subject was stochastically dependent on a group level distribution, ϕ𝜃, where the subscript 𝜃 denotes the subject level parameter. We assumed that each group level distribution ϕ𝜃 was a truncated normal distribution, where \(\phi _{\theta } \sim N(\mu ,\sigma )\) | (lower,upper). For each group level distribution, the distribution range was the same as the hyper-priors on the group level mean parameter. We set hyper-priors on the group level parameters where the mean of ϕ𝜃 had the following priors: \( A \sim N(.5,.5) | (0,\infty )\), \( B \sim N(.5,.5) | (0,\infty )\), \( v \sim N(2,2) | (-\infty ,\infty )\), \( T_{0} \sim N(.3,.3) | (0,1)\). The standard deviation of ϕ𝜃 had a Gamma prior with shape parameter α and rate parameter β, where standard deviation \(\sim {\Gamma }(\alpha = 1, \beta = 1)\). Priors supported a range of values identified in a literature review reported by Matzke and Wagenmakers (2009).
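To make the hierarchical structure concrete, the sketch below draws one set of group-level means and standard deviations from the hyper-priors stated above and then draws subject-level parameters from the resulting truncated normal distributions. It is an illustration of the generative structure only, not the fitting code used in the paper. The function names, the rejection sampler, and the dictionary layout are assumptions. Note that Python's `random.gammavariate` takes a shape and a scale, so a rate of 1 corresponds to a scale of 1.

```python
import math
import random


def truncated_normal(rng, mean, sd, lower=-math.inf, upper=math.inf):
    """Draw from N(mean, sd) restricted to (lower, upper) by simple rejection."""
    while True:
        x = rng.gauss(mean, sd)
        if lower < x < upper:
            return x


# (prior mean, prior sd, lower, upper) for the group-level mean of each
# parameter, copied from the hyper-priors stated in the text.
HYPER_PRIORS = {
    "A": (0.5, 0.5, 0.0, math.inf),
    "B": (0.5, 0.5, 0.0, math.inf),
    "v": (2.0, 2.0, -math.inf, math.inf),
    "T0": (0.3, 0.3, 0.0, 1.0),
}


def draw_subject_parameters(rng, n_subjects):
    """Sample group-level (mu, sigma) per parameter, then subject-level values."""
    subjects = [dict() for _ in range(n_subjects)]
    for name, (m, s, lo, hi) in HYPER_PRIORS.items():
        mu = truncated_normal(rng, m, s, lo, hi)  # group-level mean
        sigma = rng.gammavariate(1.0, 1.0)        # group-level sd ~ Gamma(shape 1, rate 1)
        for subj in subjects:
            # The subject-level range matches the hyper-prior range, as in the text.
            subj[name] = truncated_normal(rng, mu, sigma, lo, hi)
    return subjects


for params in draw_subject_parameters(random.Random(1), 3):
    print(params)
```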

We estimated posterior distributions of parameter values using the differential evolution Markov Chain Monte Carlo method (DE-MCMC; Ter Braak, 2006; Turner et al., 2013). DE-MCMC has been shown to efficiently estimate parameters of hierarchical versions of models similar to the RDM (e.g., Turner et al., 2015; Turner et al., 2013). For all model fits in the paper we ran the DE-MCMC algorithm with 40 chains. The starting points of these chains were drawn from the following distributions: \( A \sim N(.5,.5) | (0,\infty )\), \( B \sim N(.5,.5) | (0,\infty )\), \( v \sim N(2,2) | (-\infty ,\infty )\), \( T_{er} \sim N(.3,.3) | (0,1)\), where N(m,sd) indicates a normal distribution with mean m and standard deviation sd, and the numbers after the | indicate the distribution range.

Part of approximating posterior distributions via sampling is deciding when convergence has been obtained, that is, the point at which we are confident that samples represent the posterior distribution. All samples prior to convergence are discarded. To decide the point of convergence we both visually inspected the chains and discarded all samples prior to the \(\hat {R}\) statistic being less than 1.01 (Gelman & Rubin, 1992). \(\hat {R}\) represents the stability of parameter estimates within and between chains. The calculation involves comparing the within-chain and between-chain variances, as differences between the two sources of variance indicate a lack of stability in estimates, and potentially non-convergence. \(\hat {R}\) takes values from 1 upwards, with values closer to 1 indicating that the variance between chains is similar to the variance within chains, and thus indicating better convergence.
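The within-chain and between-chain comparison can be sketched as follows. This is one common textbook form of the Gelman and Rubin statistic for a single parameter, assuming chains of equal length. The paper does not state which variant was used, so the exact formula, the function name, and the simulated chains below are assumptions for illustration.

```python
import math
import random
from statistics import mean, variance


def gelman_rubin(chains):
    """Potential scale reduction factor for one parameter.

    chains: list of equal-length lists of samples, one list per chain.
    """
    n = len(chains[0])
    chain_means = [mean(c) for c in chains]
    within = mean([variance(c) for c in chains])  # W: average within-chain variance
    between = n * variance(chain_means)           # B: between-chain variance
    pooled = (n - 1) / n * within + between / n   # pooled estimate of posterior variance
    return math.sqrt(pooled / within)


rng = random.Random(0)
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]
stuck = mixed[:3] + [[x + 3.0 for x in mixed[3]]]  # one chain in a different region

print(gelman_rubin(mixed))  # close to 1 for well-mixed chains
print(gelman_rubin(stuck))  # clearly above 1 when a chain disagrees
```

With finite samples this estimate can fall marginally below 1, which is why the check in practice is a threshold such as the 1.01 criterion described above.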

Upon reaching the \(\hat {R}\) criterion, we drew 5000 additional samples for each chain. To save memory during computing, and given the high auto-correlation within-chains, we thinned the posterior by only keeping every 10th iteration. These 20000 (i.e., 40 chains × 500 iterations) samples constituted our posterior distribution estimates. For all model fits the Wiener process standard deviation is fixed to s = 1. We also provide R code to implement the RDM model, which we host on the related Open Science Foundation web page:

Appendix C: Model selection method

The gold standard for estimating the out-of-sample predictive accuracy of a model is cross-validation (Geisser & Eddy, 1979), but this method is computationally expensive and therefore we used a computationally cheap approximation to cross-validation. We used the widely applicable information criterion (WAIC; Watanabe, 2010) to assess the out-of-sample predictive accuracy of the LBA and RDM. WAIC requires calculating a goodness-of-fit value of a model and subtracting a value from this that represents the complexity of the model. In this regard, WAIC is like the well-known Akaike’s information criterion (AIC; Akaike, 1974) or the Bayesian information criterion (BIC; Schwarz, 1978), but is applicable to hierarchical Bayesian models.

The first step in calculating WAIC is to compute, for each posterior sample \(\theta^{s}\) with \(s = 1,...,S\), the likelihood of each data point \(y_{i}\) from the data \(y_{1},...,y_{N}\). For each data point, we then calculate the average likelihood \(Pr(y_{i})\) over the entire posterior distribution as follows:

$$ Pr(y_{i}) = \frac{1}{S} \sum\limits_{s=1}^{S} Pr(y_{i} | \theta^{s}) $$

We then sum the log of these average likelihoods over all data points to get the log point-wise predictive density, lppd, where:

$$ lppd = \sum\limits_{i=1}^{N} \log Pr(y_{i}). $$

The lppd is a biased estimate of how well the model predicts future data. It is biased because the data we use to evaluate the model are the same data we use to build the model. Essentially, in addition to fitting the signal in the data, the model has been optimized to fit noise in the data that will not be present in future data. The lppd therefore overestimates the model’s predictive accuracy, and to approximate an unbiased estimate we subtract a measure of the model’s complexity from lppd. One measure of complexity is the effective number of parameters, \(p_{WAIC}\). The effective number of parameters is related to a count of the total number of model parameters, but the metric accounts for the fact that not all parameters in the model contribute equally to the model’s fit, and so a parameter’s contribution to the count can take values between 0 and 1. To compute \(p_{WAIC}\), we first calculate, for each data point, the variance in log-likelihood across posterior samples, where:

$$ V(y_{i}) = Var_{s=1}^{S} (\log Pr(y_{i} | \theta^{s})). $$

We then sum the variance in log-likelihood over data points to get an estimate of the effective number of parameters, where:

$$ p_{WAIC} = \sum\limits_{i=1}^{N} V(y_{i}) $$

Using lppd (Eq. 19) and \(p_{WAIC}\) (Eq. 21), we can compute WAIC as follows:

$$ WAIC = -2(lppd - p_{WAIC}) $$
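The steps of this appendix can be sketched in a few lines of code. The example below assumes the point-wise log-likelihoods have already been stored as an S × N table, with one row per posterior sample and one column per data point; the function names and the toy values are hypothetical and are not taken from the R code accompanying the paper. The average likelihood is computed on the log scale (a log-mean-exp) to avoid numerical underflow, and the variance is the sample variance (denominator S − 1), which is an assumption about a detail the text does not specify.

```python
import math
from statistics import variance


def log_mean_exp(values):
    """Numerically stable log of the mean of exp(values)."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def waic(log_lik):
    """Compute lppd, p_WAIC and WAIC.

    log_lik[s][i] = log Pr(y_i | theta^s); needs at least 2 posterior samples.
    """
    n_points = len(log_lik[0])
    lppd = 0.0
    p_waic = 0.0
    for i in range(n_points):
        column = [row[i] for row in log_lik]   # all samples for data point i
        lppd += log_mean_exp(column)           # log of the average likelihood
        p_waic += variance(column)             # V(y_i): variance across samples
    return {"lppd": lppd, "p_waic": p_waic, "waic": -2.0 * (lppd - p_waic)}


# Toy table: 2 posterior samples (rows) by 2 data points (columns).
toy_log_lik = [
    [math.log(0.2), math.log(0.5)],
    [math.log(0.4), math.log(0.5)],
]
result = waic(toy_log_lik)
print(result)
```

In the toy table the second data point has the same likelihood under both posterior samples, so it contributes nothing to \(p_{WAIC}\); only data points whose likelihood changes across the posterior add to the complexity penalty.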

Appendix D: Model recovery

We generated a synthetic data set by simulating the RDM for an experiment with easy, medium, and hard difficulty conditions, where difficulty effects were generated from systematic changes in the drift rate parameter. Thirty subjects were generated from the group-level distribution, and we fit all of the simulated data sets with the RDM using the same parameterization as the generating distributions. Figure 15 displays the generating group-level parameter values superimposed on the recovered group-level posterior distributions. The generating values of the group-level distribution fall within the group-level posterior estimates of each parameter. In Fig. 16, we show that the means of the subject-level posterior distributions correlate well with the subject-level generating parameters. Both our group-level and subject-level simulations show good recovery of parameters for the RDM.
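As an illustration of how such synthetic data can be produced, the sketch below simulates a two-accumulator race for a single hypothetical subject, with difficulty manipulated through the drift rate of the correct accumulator. It is a simplified assumption-laden example, not the simulation code used for the recovery study: the accumulators start at zero, the within-trial noise is approximated with small discrete time steps, and all parameter values (drift rates, threshold, non-decision time, trial counts) are made up for the example.

```python
import math
import random


def simulate_race(drifts, threshold, non_decision, rng, s=1.0, dt=0.001, max_t=10.0):
    """Simulate one trial of a race between independent diffusion accumulators.

    Returns (winner_index, response_time), or (None, None) if no accumulator
    reaches the threshold within max_t seconds of decision time.
    """
    evidence = [0.0] * len(drifts)
    sd_step = s * math.sqrt(dt)            # noise standard deviation per time step
    for step in range(1, int(max_t / dt) + 1):
        for k, v in enumerate(drifts):
            evidence[k] += v * dt + rng.gauss(0.0, sd_step)
        top = max(evidence)
        if top >= threshold:               # first accumulator to cross wins
            return evidence.index(top), step * dt + non_decision
    return None, None


rng = random.Random(2020)
# Hypothetical drift rates for the correct accumulator in each condition.
conditions = {"easy": 3.5, "medium": 2.5, "hard": 1.5}
error_drift, threshold, non_decision = 1.0, 1.5, 0.2
n_trials = 200

accuracy, mean_rt, all_rts = {}, {}, []
for name, correct_drift in conditions.items():
    trials = [
        simulate_race([correct_drift, error_drift], threshold, non_decision, rng)
        for _ in range(n_trials)
    ]
    finished = [(w, rt) for w, rt in trials if w is not None]
    accuracy[name] = sum(w == 0 for w, _ in finished) / len(finished)
    mean_rt[name] = sum(rt for _, rt in finished) / len(finished)
    all_rts.extend(rt for _, rt in finished)

print(accuracy)
print(mean_rt)
```

In a full recovery study, data of this kind would be generated for every simulated subject from subject-level parameters drawn from the group-level distribution, and the fitted posteriors would then be compared with the generating values.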

Fig. 15

The means of the generating group-level parameter distributions are represented by red triangles, and the group-level posteriors of the recovered parameter values are shown as blue histograms. All red triangles fall within the posterior distributions, suggesting good recovery of parameters

Fig. 16

Scatterplots showing the subject-level generating parameter values as a function of the means of the subject-level recovered posteriors. Dots along the diagonal represent good recovery of parameters

Cite this article

Tillman, G., Van Zandt, T. & Logan, G.D. Sequential sampling models without random between-trial variability: the racing diffusion model of speeded decision making. Psychon Bull Rev 27, 911–936 (2020).

  • Response time
  • Sequential sampling models
  • Decision making