1 Introduction

A minority game can be exemplified by the following simple market analogy: an odd number N of traders (agents) must at each time step choose between two options, buying or selling a share, with the aim of picking the minority group. If sell is in the minority and buy in the majority, one may expect the price to go up to satisfy demand, and vice versa if buy is in the minority, which motivates the minority character of the game. Clearly, there is no way to make everyone content; more than half of the agents will inevitably end up in the majority group each round. As the losing agents will try to improve their lot, there is no static equilibrium. Instead, agents might be expected to adapt their buy or sell strategies based on perceived trends in the history of outcomes [1–12].

The Minority Game proposed by Challet and Zhang [2, 3] formalizes this type of market dynamics where agents of limited intellect compete for a scarce resource by adapting to the aggregate input of all others [1, 12]. Each agent has a set of strategies that, based on the history of minority groups over the past m time steps, predict whether the next minority will be buy or sell. At each time step the agent uses her highest-scoring strategy, i.e. the one that has most accurately predicted the minority group historically. The state space of the game is given by the strategy scores of each agent together with the recent history of minority groups, and the discrete time evolution in this space represents an intricate dynamical system.

What makes the game appealing from a physics perspective is that it can be described using methods from the statistical physics of disordered systems, with the set of randomly assigned strategies corresponding to quenched disorder [5, 8, 13–17]. In particular, Challet, Marsili, and co-workers showed that the model can be formulated in terms of the gradient descent dynamics of an underlying Hamiltonian [13], plus noise. The asymptotic dynamics corresponds to minimizing the Hamiltonian with respect to the frequency at which agents use each strategy, a problem which in turn can be solved using the replica method [8, 17, 18]. In a complementary development Coolen solved the statistical dynamics of the problem in its full complexity using generating functionals [14–16].

The game is controlled by the parameter \(\alpha =P/N\), where \(P=2^m\) is the number of distinct histories that agents take into account, which tunes the system through a phase transition (for \(N\rightarrow \infty \)) at a critical value \(\alpha _c=0.3374\ldots \). In the symmetric (or crowded) phase, \(\alpha < \alpha _c\), the game is quasi-periodic with period 2P, where a given history alternately yields one or the other outcome for the minority group [4, 19]. A somewhat oversimplified characterization of the dynamics is that the information about the last winning minority group for a given history gives rise to a crowding effect [20]: many agents want to repeat the last winning outcome, which counterproductively puts them in the majority group. The crowding also causes large fluctuations in the size of the minority group.

In the asymmetric (or dilute) phase, \(\alpha >\alpha _c\), agents are sufficiently uncorrelated that crowding effects are not important and there is no periodic behavior. Instead, as exemplified in Fig. 1, the score dynamics is random, but with a net correlation between agents that keeps fluctuations in the size of the minority group small. The dilute occupation of the full strategy space gives rise to a non-uniform frequency distribution of histories, which can be beneficial for agents with strategies that are tuned to this asymmetry.

Fig. 1

Evolution of strategy scores for the two strategies of four (\(i=1,\ldots 4\)) representative agents in a game with \(N=101\) agents and a memory of length \(m=7\) (\(P=2^7\)). At each time step every agent uses whichever of her two strategies has the highest momentary score, given by how well the strategy has predicted past minority groups. The corresponding score difference \(x_i(t)\) (inset) shows the distinction between frozen agents, which consistently use a single strategy, and fickle agents, which switch between strategies

In this paper we study the dynamics of the Minority Game in the asymmetric phase by formulating a simplified statistical model, focusing on finding probability distributions for the relative strategy scores. In particular, we study the original formulation of the game with sign-payoff for which quantitative results are challenging to derive. By sorting the strategies based on how strongly they are correlated with the average over all strategies in the game, we find that sufficient statistical information can be extracted to formulate a quantitatively accurate model for \(\alpha \gtrsim 1\).

We discuss how the probability distribution of the relative score for each agent can be derived from the master equation of a random walk on a chain with asymmetric nearest-neighbor jump probabilities, and how these jump probabilities can be calculated from the basic dynamic update equation of the scores. The corresponding probability distributions of scores take the form of either exponential localization or diffusion with a drift. In the appendices we show that the model is related to, but independent of, the Hamiltonian formulation, and we show how it can also be readily applied to the game with linear payoff, where the master equation has long-range hopping.

Although the Minority Game is well understood from the classic works discussed above, it is our hope that the simplified model of the steady state attendance and score distributions presented in this paper provides an alternative and readily accessible perspective on this fascinating model.

2 Definition of the Game and Outline

In order to give an overview of our results and for completeness we start by providing the formal definition of the Minority Game and some basic properties [2, 3, 10, 11].

At each discrete time step every agent gives a binary bid \(a_i(t)=\pm 1\), all of which are collected into a total attendance

$$\begin{aligned} A_t=\sum _{i=1}^Na_i(t)=-N,\ldots ,N\,, \end{aligned}$$
(1)

(N odd) and the winning minority group is then identified through \(-\text {sign}(A_t)\). A binary string of the m past winning bids, called a history \(\mu \), is provided as global information to each agent upon which to base her decision for the following round. There are thus \(P=2^m\) different histories, labeled \(\mu =1,\ldots ,P\). At her disposal each agent has two randomly assigned strategies (a.k.a. strategy tables) that provide a unique bid for each history. The bid of strategy \(j=1,2\) of agent \(i=1,\ldots ,N\) in response to history \(\mu \) is given by \(a_{i,j}^\mu =\pm 1\) and the full strategy is the P dimensional random binary vector \(\vec {a}_{i,j}\). There are thus a total of \(2^P\) distinct strategies available.

At each time step the agent uses the strategy that has historically made the best predictions of the minority group. This is decided by a score \(U_{i,j}(t)\) for each strategy, which is updated according to \(U_{i,j}(t+1)=U_{i,j}(t)-a_{i,j}^{\mu }\text {sign}(A^{\mu }_t)\), irrespective of whether the strategy is actually being used or not. (Here the superscript \(\mu \) on \(A_t\) just indicates that the attendance will depend on the history \(\mu (t)\) giving the bids at time t.) Ties, i.e. \(U_{i,1}=U_{i,2}\), are decided by a coin toss.
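To make these rules concrete, the following minimal sketch implements the sign-payoff game exactly as defined above. It is an illustration for the reader, not the code used for the simulations in this paper; the variable names and the integer encoding of the history string are our own choices.

```python
import numpy as np

rng = np.random.default_rng(0)
N, m = 101, 7            # number of agents (odd) and memory length
P = 2 ** m               # number of distinct histories
T = 10_000               # number of time steps

# Two random strategy tables per agent: a bid +/-1 for each of the P histories
strategies = rng.choice([-1, 1], size=(N, 2, P))
scores = np.zeros((N, 2))            # virtual scores U_{i,j}
mu = rng.integers(P)                 # initial history, encoded as an integer

for t in range(T):
    # Each agent plays her highest-scoring strategy; ties broken by coin toss
    best = np.where(scores[:, 0] == scores[:, 1],
                    rng.integers(2, size=N),
                    np.argmax(scores, axis=1))
    bids = strategies[np.arange(N), best, mu]
    A = bids.sum()                   # attendance, Eq. (1); nonzero since N is odd
    winner = -np.sign(A)             # winning minority side
    # Update both strategies' scores, used or not (sign payoff)
    scores -= strategies[:, :, mu] * np.sign(A)
    # Endogenous information: append the winning bid to the history string
    mu = ((mu << 1) | (1 if winner > 0 else 0)) % P

# The relative scores x_i of Eq. 2 are (scores[:, 0] - scores[:, 1]) / 2.
```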

Since it is only the relative score between an agent’s two strategies that is important in deciding which strategy to use, one may focus on the relative score

$$\begin{aligned} x_i(t)=(U_{i,1}(t)-U_{i,2}(t))/2\,. \end{aligned}$$
(2)

This is updated according to

$$\begin{aligned} x_i(t+1)=x_i(t)+\Delta _i(t)\,, \end{aligned}$$
(3)

where

$$\begin{aligned} \Delta _i(t)=-\xi _i^\mu \text {sign} (A^\mu _t)\,, \end{aligned}$$
(4)

and where \(\vec {\xi }_i=(\vec {a}_{i,1}-\vec {a}_{i,2})/2\) is an agent's “difference vector” that takes values \(\pm 1\) or 0 for each history \(\mu \).

To make the dynamics generated by these equations more concrete, Fig. 1 shows the scores \(U_{i,1/2}\) of the strategies of four particular agents, \(i=1,\ldots ,4\), for one realization of a game with \(N=101\), \(P=2^7\), together with the corresponding relative scores \(x_i\) (inset), over a limited time interval. As exemplified by this figure, agents come in two flavors, known as “frozen” and “fickle” [5, 14]. An agent is frozen if one of her strategies performs consistently better than the other, such that on average the score difference diverges, whereas fickle agents have a relative score that meanders around \(x=0\), switching the strategy in use. The motion of \(x_i\) for both fickle and frozen agents is a random walk with a bias towards or away from \(x=0\). A basic problem is to characterize and understand this random walk and derive the corresponding probability distribution \(P_i(x,t)\): the probability of finding agent i at position x at time t [10, 16].

2.1 Outline and Results

As presented in Sect. 3 we can quantify the correlation between an agent's strategies, specified by \(\xi _i^\mu \), and the total attendance \(A_t^\mu \), which in turn allows for characterizing the mean (time averaged) step size \(\Delta _i=\langle x_i(t+1)-x_i(t)\rangle \) in terms of a distribution over agents \(P(\Delta _i)\). In agreement with earlier work we find that \(\Delta _i\) has two contributions: a center-seeking (\(x=0\)) bias term, which arises from self-interaction (the used strategy contributes to the attendance and as such is more likely to end up in the majority group [17]), and a fitness term, which reflects the relative adaptation of the agent's two strategies to the time averaged stochastic environment of the game. The distribution of step sizes over the population of agents is shown in Fig. 3, where frozen agents are simply those for which the fitness overcomes the bias, such that \(\Delta _i>0\) for \(x>0\) or \(\Delta _i<0\) for \(x<0\), whereas for fickle agents \(\Delta _i<0\) for \(x>0\) and vice versa.

Knowing the mean step size of an agent allows for a formulation in terms of a one dimensional random walk (Fig. 4) with corresponding jump probabilities, as presented in Sect. 4. Depending on whether the walker is more likely to jump towards the center or away from it (fickle or frozen agents, respectively), the master equation on the chain can be solved in terms of a stationary exponential distribution centered at \(x=0\) or (in the continuum limit) a normal distribution with a variance and mean that grow linearly in time (diffusion with drift). These are the distributions \(P_i(x,t)\), which depend on \(\Delta _i\).

In simulations over many agents it is natural to consider the full distribution \(P(x,t)=\sum _{i=1}^{N} P_i(x,t)/N=\int P(\Delta _i)P_i(x,t)d\Delta _i\), with \(NP(x,t)\) thus the probability of finding an agent at time t with relative score x. In terms of the scaled coordinates \(x/\sqrt{N}\) and t / N we find that the distribution depends only on \(\alpha \). The model distributions show excellent agreement with direct numerical simulations (Figs. 5 and 6) with no fitting parameters. This result for the full distribution of relative scores, together with its systematic derivation for the original sign-payoff game, represents the main result of this paper.

In Appendix 2 we discuss the relation between the model presented in this work and the formulation in terms of a minimization problem for a Hamiltonian generating the asymptotic dynamics [8, 13]. We find that one way to view the present model is as a reduced ansatz for the ground state, where the only parameters are the fractions of positively and negatively frozen agents (solved for self-consistently), instead of the full set of frequencies with which each strategy is used. With this ansatz, closed expressions can be derived for the steady state distributions irrespective of the form of the Hamiltonian.

In Appendix 3 we show how the model applies to the game with linear payoff \(\Delta _i(t)=-\xi _i^\mu A^\mu _t\).

3 Statistical Model

We will now turn to describing the statistical model in some detail and derive the results discussed in the previous section. We define for each agent the sum and difference of her two strategies, \(\vec {\omega }_i=(\vec {a}_{i,1}+\vec {a}_{i,2})/2\) and (as discussed above) \(\vec {\xi }_i=(\vec {a}_{i,1}-\vec {a}_{i,2})/2\) [5]. Clearly \(\omega _i^\mu \), being half the sum of two random numbers \(\pm 1\), is distributed over \((-1,0,1)\) with probability (1 / 4, 1 / 2, 1 / 4). A non-zero value of \(\omega _i^\mu \) means that agent i always has the same bid for history \(\mu \), independently of which strategy she has in play. The sum over all agents, \(\vec {\Omega }=\sum _{i=1}^N\vec {\omega }_i\), thus gives a constant, history dependent but time independent, background contribution to the attendance. (In the sense that every time history \(\mu \) occurs in the time series it gives the same contribution.) This background \(\Omega ^\mu \) is, for large N, normally distributed with mean zero and variance

$$\begin{aligned} \sigma _\Omega ^2=N/2\,. \end{aligned}$$
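For completeness, this value follows directly from the single-agent distribution quoted above: each \(\omega _i^\mu \) has mean zero and variance

$$\begin{aligned} \langle (\omega _i^\mu )^2\rangle =\tfrac{1}{4}\cdot 1+\tfrac{1}{2}\cdot 0+\tfrac{1}{4}\cdot 1=\tfrac{1}{2}\,, \end{aligned}$$

so the sum over N independent agents has variance N / 2.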

An interesting property of the Minority Game is that there is a “\(Z_2\) gauge” freedom with respect to the arbitrary choice of which strategy is called 1 and which is called 2, corresponding to a change of sign of \(\vec {\xi }_i\). Such a sign change simply results in a change of sign of \(x_i(t)\), with no consequence for which strategy is actually in play. (It is the strategy in play which is an observable, not whether it is labeled by 1 or 2.) Nevertheless, it turns out that making a consistent definition of the order of strategies is helpful in formulating a simple statistical model. Explicitly, we order the two strategies (“fix the gauge”) of all agents i such that

$$\begin{aligned} \vec {\xi }_i\cdot \vec {\Omega }\le 0\,. \end{aligned}$$
(5)

Shortly we will describe the distribution over agents of \(\xi _i^\mu \), to quantify its anticorrelation with \(\Omega ^\mu \).
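In code, this gauge fixing amounts to at most one relabeling per agent; a minimal self-contained sketch (our own construction, with illustrative variable names):

```python
import numpy as np

rng = np.random.default_rng(0)
N, P = 101, 128
strategies = rng.choice([-1, 1], size=(N, 2, P))

omega = (strategies[:, 0] + strategies[:, 1]) / 2    # omega_i, one row per agent
xi = (strategies[:, 0] - strategies[:, 1]) / 2       # xi_i
Omega = omega.sum(axis=0)                            # background attendance per history

# Gauge fix, Eq. (5): relabel strategies 1 <-> 2 whenever xi_i . Omega > 0
flip = (xi @ Omega) > 0
xi[flip] *= -1
strategies[flip] = strategies[flip][:, ::-1, :]
```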

To proceed we write the attendance at a time step t with history \(\mu \) as

$$\begin{aligned} A^\mu _t=\Omega ^\mu +\sum _i\xi ^\mu _i s_i(t)\,, \end{aligned}$$
(6)

where \(s_i(t)=\pm 1\) depending on which strategy agent i is playing [5]. Again, the relative strategy score \(x_i\) of agent i is updated according to Eq. 4. Given the background contribution \(\vec {\Omega }\) to the attendance, we expect a surplus of \(s_i=1\) in the steady state with our choice of gauge, because strategy 1 is expected to be favored by the score update. (In other words, strategy 1 is expected to have a higher fitness.) However, this correlation is not trivial, as the accumulated score also depends on the dynamically generated contribution to the attendance. As discussed previously, some fraction \(\phi \) of the agents are frozen, in the sense of always using the same strategy, \(s_i=\text {constant}\). We make an additional distinction (made significant by our choice of gauge) and separate the group of frozen agents into those with \(s_i(t)=1\) (fraction \(\phi _1\)), and those with \(s_i(t)=-1\) (fraction \(\phi _2\)), such that \(\phi =\phi _1+\phi _2\). Clearly, we expect the former to be more plentiful than the latter.

We will now derive steady state distributions over agents for the mean step size \(\Delta _i\). For this purpose we will write the attendance as

$$\begin{aligned} A^{\mu }_t=\Omega ^{\mu }+X^{\mu }+Y^\mu +S_t\,, \end{aligned}$$
(7)

where

$$\begin{aligned} X^\mu= & {} \sum _{i \in \phi _1}\xi ^\mu _i \end{aligned}$$
(8)
$$\begin{aligned} Y^\mu= & {} -\sum _{i \in \phi _2}\xi ^\mu _i \end{aligned}$$
(9)
$$\begin{aligned} S_t= & {} \sum _{i \text { fickle}}\xi _i^\mu s_i(t)\,, \end{aligned}$$
(10)

corresponding to the three categories of agents discussed previously. We will make the following simplifying approximations for these three components: the fickle component we will model as completely disordered, such that \(s_i(t)=\pm 1\) is random, and correspondingly (for large N) \(S_t\) is normally distributed with mean zero and variance

$$\begin{aligned} \sigma _S^2=\varphi N/2\,, \end{aligned}$$

with \(\varphi =(1-\phi _1-\phi _2)\) the fraction of fickle agents. (We thus neglect that the fickle agents also have a net anticorrelation with the background \(\vec {\Omega }\).) We will assume the frozen-agent contributions to simply be sums of independent random variables drawn from the distribution of \(\vec {\xi }\), thus neglecting that the agents that are frozen may come from the extremes of this distribution.

To proceed, we need to find the distribution of \(\vec {\xi }_i\), i.e. how it varies over the set of agents. (Henceforth we will usually drop the index i and regard the objects as drawn from a distribution.) Begin by defining \(\vec {\psi }=\text {Random}(\pm 1)\vec {\xi }\), which is thus disordered with respect to the sign of \(\vec {\Omega }\cdot \vec {\psi }\) (footnote 1). The object \(\psi ^\mu \) is independent of \(\Omega ^\mu \) (ignoring 1 / N corrections due to \(\Omega ^\mu \ne 0\) limiting the available bids \(\pm 1\)), taking values \((1,0,-1)\) with probability (1 / 4, 1 / 2, 1 / 4), which gives mean zero and variance 1 / 2. Consider the joint object \(h=\frac{1}{P}\vec {\Omega }\cdot \vec {\psi }\); for large P this becomes normally distributed with mean zero and variance \(\sigma _h^2=\frac{1}{P}(N/2)(1/2)=1/(4\alpha )\) [5].

Now, to quantify the correlation between \(\vec {\xi }\) and \(\vec {\Omega }\) we define the object

$$\begin{aligned} \tilde{h}=\frac{1}{P}\vec {\Omega }\cdot \vec {\xi }=-|h| \end{aligned}$$

which consequently has mean \(\langle \tilde{h}\rangle =-\int dh\,P(h)|h|=-1/\sqrt{2\pi \alpha }\) and \(\langle \tilde{h}^2\rangle =\sigma _h^2\). We will represent this distribution by assuming that each component \(\xi ^\mu \) is an independent Gaussian random variable with a mean that depends linearly on \(\Omega ^\mu \). With this assumption we find the conditional distribution

$$\begin{aligned} P_{\xi ^\mu |\Omega ^\mu }=\mathcal{N}_{\xi ^\mu }(-c(\alpha )\Omega ^\mu /N,\sigma _\xi )\,, \end{aligned}$$
(11)

where \(c(\alpha )=\sqrt{\frac{2}{\pi \alpha }}\) and \(\sigma ^2_{\xi }=1/2\), and where we write the normal distribution over x with mean \(\mu \) and variance \(\sigma ^2\) as \(\mathcal{N}_x(\mu ,\sigma )=\frac{1}{\sqrt{2\pi }\sigma }e^{-(x-\mu )^2/2\sigma ^2}\). This quantifies that \(\xi ^\mu \) is on average anticorrelated with \(\Omega ^\mu \), which is expected to place strategy 1 in the minority group more often than strategy 2.
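The coefficient \(c(\alpha )\) can be checked numerically by generating random strategies, applying the gauge fix of Eq. 5, and regressing \(\xi ^\mu \) on \(\Omega ^\mu \). The sketch below (our own construction, not part of the original analysis) also checks \(\langle \tilde{h}\rangle =-1/\sqrt{2\pi \alpha }\).

```python
import numpy as np

rng = np.random.default_rng(1)
N, P = 500, 2000                        # alpha = P/N = 4
a = rng.choice([-1, 1], size=(N, 2, P))
omega = (a[:, 0] + a[:, 1]) / 2
xi = (a[:, 0] - a[:, 1]) / 2
Omega = omega.sum(axis=0)
xi[(xi @ Omega) > 0] *= -1              # gauge fix, Eq. (5)

alpha = P / N
h_tilde = (xi @ Omega) / P              # per-agent htilde
print(h_tilde.mean(), -1 / np.sqrt(2 * np.pi * alpha))

# Pooled least-squares slope of xi^mu on Omega^mu, expected ~ -c(alpha)/N
slope = (xi * Omega).sum() / (N * (Omega ** 2).sum())
print(slope, -np.sqrt(2 / (np.pi * alpha)) / N)
```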

Using Eq. 11 we can also calculate the distributions of \(X^\mu \) (\(Y^\mu \)) as the sum of \(\phi _1 N\) (\(\phi _2 N\)) correlated objects \(\xi _i^\mu \), giving

$$\begin{aligned} P_{X^\mu |\Omega ^\mu }= & {} \mathcal{N}_{X^\mu }(-c(\alpha )\phi _1\Omega ^\mu ,\sigma _{X|\Omega })\end{aligned}$$
(12)
$$\begin{aligned} P_{Y^\mu |\Omega ^\mu }= & {} \mathcal{N}_{Y^\mu }(c(\alpha )\phi _2\Omega ^\mu ,\sigma _{Y|\Omega })\,, \end{aligned}$$
(13)

with conditional variances \(\sigma _{X|\Omega }^2=\phi _1N/2\) and \(\sigma _{Y|\Omega }^2=\phi _2N/2\).

3.1 Distribution of Step Sizes

Given the model expressions for the distributions of all the components of the score update equation (Eq. 4) we can now find the distribution of mean (time averaged) step sizes. As a first step we integrate out the fast variable \(S_t\) to get a time averaged step size conditional on \(\mu \), \(\Delta ^\mu =\langle \Delta (t)|\mu \rangle \). (Over a long time series of the game every history \(\mu \) will occur many times; we thus average over all those occurrences of a single history.) This corresponds to

$$\begin{aligned} \Delta ^\mu= & {} -\xi ^\mu \int dS P(S) \big [\text {sign}(\Omega ^\mu +X^\mu +Y^\mu +S)\nonumber \\&+\,\text {sign}(x) \xi ^\mu \delta \big (\tfrac{1}{2}(\Omega ^\mu +X^\mu +Y^\mu +S)\big )\big ]\,. \end{aligned}$$
(14)

The second term, which is a self-interaction, follows from the discrete nature of the original problem. It gives a negative bias for the used strategy, coming from the fact that if the net attendance from all other agents is zero, the used strategy puts the agent in the majority group. (The factor \(\frac{1}{2}\) in the delta function accounts for the fact that the attendance, as defined in Eq. 1, changes in steps of two, and the factor \(\text {sign}(x) \xi ^\mu \) comes from the fact that only the used strategy enters the attendance.) Integrating over S gives

$$\begin{aligned} \Delta ^\mu= & {} \Delta _{\text {fit}}^\mu +\Delta _{\text {bias}}^\mu \nonumber \\= & {} -\xi ^\mu \text {erf}\left( \frac{\Omega ^\mu +X^\mu +Y^\mu }{\sqrt{2}\sigma _S}\right) \nonumber \\&-\text {sign}(x)(\xi ^{\mu })^2\sqrt{\frac{2}{\pi }}\frac{1}{\sigma _S}e^{-(\Omega ^\mu +X^\mu +Y^\mu )^2/2\sigma _S^2}\,, \end{aligned}$$
(15)

where we have identified the first term as a fitness \(\Delta _{\text {fit}}\) which quantifies the relative fitness of the agent’s two strategies and the second as a negative bias \(\Delta _{\text {bias}}\) for the used strategy as discussed previously.
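For the reader's convenience, the first term of Eq. 15 follows from the standard Gaussian average of the sign function appearing in Eq. 14,

$$\begin{aligned} \int dS\,\mathcal{N}_S(0,\sigma _S)\,\text {sign}(B+S)=\text {erf}\left( \frac{B}{\sqrt{2}\sigma _S}\right) \,, \end{aligned}$$

with \(B=\Omega ^\mu +X^\mu +Y^\mu \), while the second term follows from \(\delta \big (\tfrac{1}{2}(B+S)\big )=2\,\delta (B+S)\), which picks out \(2\,\mathcal{N}_{-B}(0,\sigma _S)=\sqrt{2/\pi }\,\sigma _S^{-1}e^{-B^2/2\sigma _S^2}\).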

To calculate the distribution of mean step sizes we will assume that histories occur with the same frequency, such that \(\Delta =\frac{1}{P}\sum _\mu \Delta ^\mu \). This is in fact not the case for a single realization of the game in the dilute phase: some histories occur more often than others, as one can see directly from any simulation in this regime. Nevertheless, for large P we will assume that this variation in the occurrence of \(\mu \) averages out. As discussed extensively in the literature, the overall behavior of the game is insensitive to whether the actual history is used as input to the agents (endogenous information) or a random history is supplied (exogenous information) [10, 11, 16, 21, 22]. This is also confirmed by the present work through the good agreement between the model, which uses exogenous information, and simulations in which we use the actual history.

Assuming large P and given the assumption of independence of the distributions \(\Omega ,\xi ,X,Y\) for different \(\mu \) we expect the distribution \(P(\Delta )\) to approach a Gaussian (by the central limit theorem) with mean

$$\begin{aligned} \bar{\Delta }=\int d\Omega d\xi dX dY P_\Omega P_{\xi |\Omega } P_{X|\Omega } P_{Y|\Omega } \Delta ^\mu \,, \end{aligned}$$
(16)

with \(\Delta ^\mu \) as in Eq. 15, and with variance \(\sigma ^2=\frac{1}{P}(\overline{\Delta ^2}-\bar{\Delta }^2)\).

The integrals are readily done analytically as described in the Appendix 1, but the expressions are very lengthy. The main features can be expressed in the following form:

$$\begin{aligned} \bar{\Delta }_{\text {bias}}= & {} -\text {sign}(x)\frac{1}{\sqrt{N}}\tilde{\Delta }_\mathrm{bias}(\alpha ,\phi _1,\phi _2)\nonumber \\ \bar{\Delta }_{\text {fit}}= & {} \frac{1}{\sqrt{\alpha N}}\tilde{\Delta }_{\mathrm{{fit}}}(\alpha ,\phi _1,\phi _2)\,, \end{aligned}$$
(17)

where \(\tilde{\Delta }_{\text {bias}/\text {fit}}>0\) are functions that depend on N and P only through \(\alpha =P/N\), change slowly as functions of their arguments in the physically relevant regime \(0\le \phi _1+\phi _2\le 1\) (Fig. 7), and satisfy \(\tilde{\Delta }_{\text {bias}}(\alpha ,0,0)=\frac{1}{\sqrt{2\pi }}\) and \(\tilde{\Delta }_{\text {fit}}(\alpha ,0,0)=\frac{1}{\pi }\). As seen from Eq. 17, the mean bias is towards \(x=0\) (the used strategy is penalized), while the mean fitness is positive, acting to increase the relative score x, consistent with our choice of gauge as discussed earlier.
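As a consistency check of the first of these limits (our own verification, within the model assumptions): for \(\phi _1=\phi _2=0\) we have \(X^\mu =Y^\mu =0\), \(\sigma _S^2=N/2\), and \(\langle (\xi ^\mu )^2\rangle =1/2\) to leading order, so averaging the bias term of Eq. 15 over \(\Omega ^\mu \) gives

$$\begin{aligned} |\bar{\Delta }_{\text {bias}}|=\frac{1}{2}\sqrt{\frac{2}{\pi }}\frac{1}{\sigma _S}\int d\Omega \, \mathcal{N}_\Omega (0,\sqrt{N/2})\,e^{-\Omega ^2/2\sigma _S^2}=\frac{1}{2}\sqrt{\frac{2}{\pi }}\sqrt{\frac{2}{N}}\frac{1}{\sqrt{2}}=\frac{1}{\sqrt{2\pi N}}\,, \end{aligned}$$

consistent with \(\tilde{\Delta }_{\text {bias}}(\alpha ,0,0)=1/\sqrt{2\pi }\) in Eq. 17.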

The only appreciable contribution to the variance comes from the fitness term, scaling as 1 / P, whereas the bias has a variance that scales with 1 / (NP) and is thus negligible (as is the cross term). The variance can be written

$$\begin{aligned} \sigma ^2_{\text {bias}}= & {} 0 \end{aligned}$$
(18)
$$\begin{aligned} \sigma ^2_{\text {fit}}= & {} \frac{1}{\alpha N}\tilde{\sigma }^2(\alpha ,\phi _1,\phi _2)\,, \end{aligned}$$
(19)

where \(\tilde{\sigma }>0\) also changes slowly in the relevant regime (Fig. 7) and satisfies \(\tilde{\sigma }(\alpha ,0,0)=\frac{1}{\sqrt{6}}\). The width of the fitness distribution explains the fact that even though \(\bar{\Delta }_\mathrm{fit}>0\), consistent with \(\phi _1\ne 0\), there are also some agents with a large negative fitness, which implies \(\phi _2\ne 0\). The fact that \(\vec {\xi }\cdot \vec {\Omega }< 0\) thus does not necessarily imply that strategy 1 is more successful than strategy 2, as the correlation with the other frozen agents is also an important factor. For large \(\alpha \), both the mean and variance of the fitness vanish, which can be understood as a result of there being too few agents compared to the number of possible outcomes to maintain any appreciable correlation between an agent's strategies and the aggregate background, \(\vec {\xi }\cdot \vec {\Omega }\approx 0\). In this limit, since the bias term always penalizes the used strategy, there can be no frozen agents. We also see that both the mean and width of the distribution for given \(\alpha \) scale with \(1/\sqrt{N}\), consistent with simulations (Fig. 3).

3.2 Fraction of Frozen Agents

For each agent the score difference \(x_i\) moves with a mean step per unit time of

$$\begin{aligned} \Delta ^+= & {} \Delta _{\text {fit}}-|\bar{\Delta }_\mathrm{bias}|\text { for }x>0 \nonumber \\ \Delta ^-= & {} \Delta _{\text {fit}}+|\bar{\Delta }_\mathrm{bias}|\text { for }x<0\,, \end{aligned}$$
(20)

where \(\Delta _{\text {fit}}\) is drawn from the distribution \(\mathcal{N}(\bar{\Delta }_{\text {fit}},\sigma _\mathrm{fit})\). If the fitness is high, such that \(\Delta ^+>0\), the agent has a net positive drift and is frozen, with \(x_i>0\) growing unbounded. The fraction of positively frozen agents is given by

$$\begin{aligned} \phi _1= & {} \int _{|\bar{\Delta }_{\text {bias}}|}^\infty dz\,\mathcal{N}_z(\bar{\Delta }_{\text {fit}},\sigma _{\text {fit}})\nonumber \\= & {} \frac{1}{2}+\frac{1}{2}\text {erf}\left[ \sqrt{\frac{\alpha }{2}}\left( \frac{\tilde{\Delta }_{\text {fit}}/\sqrt{\alpha }-|\tilde{\Delta }_{\text {bias}}|}{\tilde{\sigma }}\right) \right] \,. \end{aligned}$$
(21)

Similarly, if the fitness is sufficiently poor that \(\Delta ^-<0\), the agent is frozen with \(x_i<0\) and magnitude growing unbounded. The fraction of negatively frozen agents is given by

$$\begin{aligned} \phi _2= & {} \int _{-\infty }^{-|\bar{\Delta }_{\text {bias}}|} dz\,\mathcal{N}_z(\bar{\Delta }_{\text {fit}},\sigma _{\text {fit}})\nonumber \\= & {} \frac{1}{2}-\frac{1}{2}\text {erf}\left[ \sqrt{\frac{\alpha }{2}}\left( \frac{\tilde{\Delta }_{\text {fit}}/\sqrt{\alpha }+|\tilde{\Delta }_{\text {bias}}|}{\tilde{\sigma }}\right) \right] \,, \end{aligned}$$
(22)

and correspondingly the total fraction of frozen agents \(\phi =\phi _1+\phi _2\) and of fickle agents \(\varphi =1-\phi \) are found. Since \(\tilde{\Delta }_{\text {fit}}\), \(\tilde{\Delta }_{\text {bias}}\), and \(\tilde{\sigma }\) are functions of \(\alpha \), \(\phi _1\), and \(\phi _2\), the two equations allow for solving for \(\phi _1(\alpha )\) and \(\phi _2(\alpha )\) as functions of the single parameter \(\alpha \). The solutions are readily found by forward iteration, and the results are plotted and compared to direct simulations of the game in Fig. 2 (footnote 2). The fit is good, but there is no indication of a phase transition for small \(\alpha \) in this simplified model.
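A minimal sketch of this forward iteration is given below. The closed forms of \(\tilde{\Delta }_{\text {fit}}\), \(\tilde{\Delta }_{\text {bias}}\), and \(\tilde{\sigma }\) from Appendix 1 are assumed to be available as the hypothetical callables delta_fit, delta_bias and sigma; these names are placeholders of our own choosing, so the snippet illustrates the scheme rather than reproducing the full solution.

```python
from math import erf, sqrt

def solve_frozen_fractions(alpha, delta_fit, delta_bias, sigma,
                           tol=1e-10, max_iter=10_000):
    """Forward iteration of Eqs. 21 and 22.

    delta_fit, delta_bias, sigma are callables (alpha, phi1, phi2) -> float
    implementing the closed forms of Appendix 1 (not reproduced here).
    """
    phi1, phi2 = 0.0, 0.0
    for _ in range(max_iter):
        df = delta_fit(alpha, phi1, phi2)
        db = abs(delta_bias(alpha, phi1, phi2))
        s = sigma(alpha, phi1, phi2)
        new1 = 0.5 + 0.5 * erf(sqrt(alpha / 2) * (df / sqrt(alpha) - db) / s)  # Eq. 21
        new2 = 0.5 - 0.5 * erf(sqrt(alpha / 2) * (df / sqrt(alpha) + db) / s)  # Eq. 22
        if abs(new1 - phi1) + abs(new2 - phi2) < tol:
            break
        phi1, phi2 = new1, new2
    return phi1, phi2
```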

Fig. 2

The fraction of frozen agents as a function of \(\alpha =P/N\) from the statistical model (Eqs. 21 and 22) compared to results from direct numerical simulations of the game. The frozen agents are divided into two groups, \(\phi _1\) and \(\phi _2\), depending on whether they are frozen with relative score \(x>0\) or \(x<0\), respectively. The fact that \(\phi _1>\phi _2\) follows from our convention \(\vec {\xi }_i\cdot \vec {\Omega }\le 0\) (Eq. 5). Also shown is the total fraction of frozen agents from the replica calculation for linear payoff (Eqs. 3.41–3.44 of [10]). (Each data point is averaged over 20 runs with \(\sim 1e6\) time steps each (1e5 steps for \(N=2001\)))

Fig. 3

Distributions for mean step per unit time \(\Delta =\langle x(t+1)-x(t)\rangle \) at \(\alpha \approx 4\) for \(x>0\) (top) and \(x<0\) (bottom), comparing direct simulations of the game to the statistical model (Eq. 20). The fraction of frozen agents with \(x>0\) (\(\phi _1\)) is indicated by ”fr,+” and similarly for \(x<0\) (\(\phi _2\)). The distributions of step sizes are different for \(x>0\) and \(x<0\) because of the convention \(\vec {\xi }_i\cdot \vec {\Omega }\le 0\) as explained in Fig. 2. (Simulations averaged over 1e6 time steps, excluding a 1e4 equilibration time)

From simulations we can also measure the distribution of mean step sizes to compare to the model, as shown in Fig. 3. There we show an intermediate value of \(\alpha \); the fit in terms of mean and width is not as good close to \(\alpha _c\) and almost perfect for large \(\alpha \), but everywhere the data seem well represented by a normal distribution. We also use the mean step size distributions from simulations to calculate the fraction of frozen agents, Fig. 2. (The naive way to distinguish frozen from switching agents, introducing a cut-off \(x_{\text {cut}}\) at some time t and considering any agent with \(|x_t|>x_{\text {cut}}\) frozen, makes it difficult to separate frozen from switching agents with \(\Delta \) near 0.)

4 Distributions Over x

We now use the fact that each agent is characterized by an average step size per unit time, specified by the fitness \(\Delta _{\text {fit}}\), to describe the movement of the relative score x on the set of integers. Suppose that the agent at time step t has score difference x; what is the probability that at time \(t+1\) the score difference is \(x'\)? In each time step, x can only change by \(-1,0,1\), as given by the basic score update Eq. 4. We specify the respective probabilities \(p_-,p_0,p_+\) with \(p_-+p_0+p_+=1\) for \(x>0\) and \(q_-,q_0,q_+\) for \(x<0\). The mean probability that x remains unchanged is \(p_0=q_0=\frac{1}{2}\), as this corresponds to \(\xi _i^\mu =0\), meaning that the agent's two strategies have the same bid, which on average (over \(\mu \)) will be the case for half of the histories. It should also be clear that the stepping probabilities cannot depend on the magnitude of x, only on its sign, because the difference in score between strategies does not enter the game, only which strategy is currently used. The case \(x=0\) has to be treated separately: a coin toss decides which strategy is used, so the probability of a \(+1\) increment is \((p_++q_+)/2\) and that of a \(-1\) increment is \((p_-+q_-)/2\). The movement of x thus corresponds to a one-dimensional random walk on a chain, with asymmetric jump probabilities, as sketched in Fig. 4.

Fig. 4

The movement of the relative strategy score x of an agent is described by a random walk on a chain with jump probabilities \(p_+,p_-,p_0\) for \(x>1\) (i.e. strategy 1 in play) and \(q_+,q_-,q_0\) for \(x<-1\) (i.e. strategy 2 in play). At the boundary \(x=-1,0,1\) due to the coin toss choice of strategy the probabilities are altered as in the figure

To relate the probabilities to the mean step size we note that for \(x>0\), \(\Delta ^+=1\cdot p_++0\cdot p_0 -1\cdot p_-\), which together with the conservation of probability and the fact that \(p_0=1/2\) gives

$$\begin{aligned} p_{\pm }= & {} \frac{1}{4}\pm \frac{\Delta ^+}{2}\end{aligned}$$
(23)
$$\begin{aligned} q_{\pm }= & {} \frac{1}{4}\pm \frac{\Delta ^-}{2}\,, \end{aligned}$$
(24)

where results for q follow from the same analysis for \(x<0\). Keeping in mind that for a fickle agent \(\Delta ^+<0\) and \(\Delta ^->0\) this is of course consistent with \(p_+<p_-\) and \(q_-<q_+\). A frozen agent is instead given by \(p_+>p_-\) or \(q_->q_+\).

With the known probabilities we can write down a master equation on the chain for the probability distribution \(P_x(t)\) (implicit \(\Delta _{\text {fit}}\) dependence)

$$\begin{aligned} P_x(t+1)= & {} p_0P_x(t)+p_+P_{x-1}(t)+p_-P_{x+1}(t),\,x>1\nonumber \\ P_x(t+1)= & {} q_0P_x(t)+q_+P_{x-1}(t)+q_-P_{x+1}(t),\,x<-1\,,\nonumber \\ \end{aligned}$$
(25)

and at the boundary

$$\begin{aligned} P_1(t+1)= & {} p_0P_1(t)+\frac{1}{2}(p_++q_+)P_{0}(t)+p_-P_{2}(t),\nonumber \\ P_0(t+1)= & {} \frac{1}{2}(q_0+p_0)P_0(t)+q_+P_{-1}(t)+p_-P_{1}(t),\nonumber \\ P_{-1}(t+1)= & {} q_0P_{-1}(t)+q_+P_{-2}(t)+\frac{1}{2}(p_-+q_-)P_{0}(t)\,. \nonumber \\ \end{aligned}$$
(26)

Assuming that the distribution is stationary, such that \(P_x(t)=P_x\), and concentrating on \(x>0\), we find after some manipulations the equation

$$\begin{aligned} \frac{p_-}{p_+}-\frac{P_{x-1}}{P_x}=\frac{p_-}{p_+}\frac{P_{x+1}}{P_x}-1 \end{aligned}$$

which has the exponential solution

$$\begin{aligned} P_x\sim \left( \frac{p_-}{p_+}\right) ^{-x}=e^{-x\ln \frac{p_-}{p_+}}\approx e^{4x\Delta ^+}\,,x>1\,. \end{aligned}$$
(27)

In the last step we used Eq. 23 and the fact that, from Eq. 17, the mean step size is small, \(|\Delta ^+|\sim 1/\sqrt{N}\ll 1\). From this we can identify a decay length \(x_+=1/(4|\Delta ^+|)\sim \sqrt{N}\), which characterizes the range of positive excursions of the score difference of a fickle agent. Clearly, this solution requires \(p_->p_+\) (\(\Delta ^+<0\)) to be bounded, as is the case for fickle agents. From the same analysis for \(x<-1\), the fickle agents with \(q_-<q_+\) have the distribution \(P_x\sim e^{x\ln \frac{q_+}{q_-}}\approx e^{4x\Delta ^-}\). What remains is to match the solutions for positive and negative x at the interface. This can be done exactly, but given that the exponential decay rates are small we settle for the approximate expression

$$\begin{aligned} P_x\approx & {} e^{-4|\Delta ^+|x}P_0,\,x\ge 0\nonumber \\ P_x\approx & {} e^{4\Delta ^- x}P_{0},\,x\le 0\,\nonumber \\ P_0\approx & {} \frac{4}{|\Delta ^+|^{-1}+(\Delta ^-)^{-1}}\,. \end{aligned}$$
(28)

From this expression we see that the distribution is asymmetric: since on average \(|\Delta ^+|<\Delta ^-\), agents are more likely to be found with \(x>0\). This opens up for more sophisticated modelling (left for future work) in which this asymmetry is fed back into the statistical description of the fickle agents, with the dynamical variable \(S_t\), the total attendance of the fickle agents, acquiring a \(\mu \)-dependent mean.
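The prefactor \(P_0\) in Eq. 28 follows from normalization, approximating the sums over x by integrals:

$$\begin{aligned} 1=\sum _x P_x\approx P_0\left( \int _0^\infty dx\, e^{-4|\Delta ^+|x}+\int _{-\infty }^0 dx\, e^{4\Delta ^- x}\right) =\frac{P_0}{4}\left( \frac{1}{|\Delta ^+|}+\frac{1}{\Delta ^-}\right) \,. \end{aligned}$$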

For the frozen agents the master equation is the same, but given \(p_+>p_-\) (or \(q_->q_+\)) we expect a drift of the mean of the distribution. Thus, focusing on long times, we can consider one or the other of the two equations in Eq. 25, depending on whether the agent is frozen with \(x>0\) or \(x<0\). For \(x>0\), and assuming that the agent at time \(t=0\) is at site \(x=0\) (neglecting the influence of any excursions to \(x<0\)), we can write down an exact expression for \(P_{x}(t)\) in terms of a multinomial distribution. Alternatively, and more simply, we can take the continuum limit \(P_x(t+1)=P(x,t)+\frac{\partial P}{\partial t}\) and \(P_{x\pm 1}(t)=P(x,t)\pm \frac{\partial P}{\partial x}+\frac{1}{2}\frac{\partial ^2P}{\partial x^2}\) to find the Fokker-Planck equation

$$\begin{aligned} \frac{\partial P}{\partial t}=-(p_+-p_-)\frac{\partial P}{\partial x}+\frac{1}{2}(p_++p_-) \frac{\partial ^2 P}{\partial x^2}\,. \end{aligned}$$
(29)

Given the initial condition \(P(x,0)=\delta (x)\) this has the solution \(P(x,t)=\mathcal{N}_x(\bar{x},\sigma _t)\) with \(\bar{x}=(p_+-p_-)t=\Delta ^+t\) and \(\sigma ^2_t=(p_++p_-)t=\frac{1}{2}t\), thus describing diffusion with a drift.
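Both limits can be checked by iterating the master equation (Eqs. 25 and 26) numerically on a finite chain; the sketch below is our own construction, with the magnitudes of \(\Delta ^\pm \) chosen arbitrarily for illustration (in the game they are of order \(1/\sqrt{N}\)).

```python
import numpy as np

def evolve(delta_plus, delta_minus, L=400, T=100_000):
    """Iterate Eqs. 25-26 on sites x = -L..L, starting from P_x = delta_{x,0}.

    Jump probabilities follow Eqs. 23-24; probability leaking past +/-L is
    neglected, so L must exceed the range explored by the walker.
    """
    p_up, p_dn = 0.25 + delta_plus / 2, 0.25 - delta_plus / 2    # x > 0
    q_up, q_dn = 0.25 + delta_minus / 2, 0.25 - delta_minus / 2  # x < 0
    x = np.arange(-L, L + 1)
    up = np.where(x > 0, p_up, np.where(x < 0, q_up, (p_up + q_up) / 2))
    dn = np.where(x > 0, p_dn, np.where(x < 0, q_dn, (p_dn + q_dn) / 2))
    P = np.zeros(2 * L + 1)
    P[L] = 1.0                            # start at x = 0
    for _ in range(T):
        new = 0.5 * P                     # staying probability p_0 = q_0 = 1/2
        new[1:] += up[:-1] * P[:-1]       # inflow from the left neighbor
        new[:-1] += dn[1:] * P[1:]        # inflow from the right neighbor
        P = new
    return x, P

# A fickle agent (Delta+ < 0 < Delta-) relaxes to the two-sided exponential of Eq. 28;
# a frozen agent (Delta+ > 0) drifts with mean ~ Delta+ * t and variance ~ t/2.
x_fi, P_fickle = evolve(-0.02, 0.03)
x_fr, P_frozen = evolve(+0.02, 0.03, L=4000, T=50_000)
```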

4.1 Full Score Distributions

Given that we now have a description of the relative score distribution of a single agent in terms of an asymmetric exponential decay or diffusion, we can also consider the full distribution of relative scores over all agents, by integrating over the distribution of mean step sizes. Defining the scaled variables \(\tilde{x}=x/\sqrt{N}\) and \(\tilde{t}=t/N\) we write \(P(\tilde{x},\tilde{t})=P_{\text {fi}}(\tilde{x})+P_{\text {fr},+}(\tilde{x},\tilde{t})+P_{\text {fr},-} (\tilde{x},\tilde{t})\), corresponding to the stationary distribution of the fickle agents and diffusive distributions of the frozen agents with \(x>0\) and \(x<0\) respectively. The first component is

$$\begin{aligned} P_{\text {fi}}(\tilde{x})=\int _{-b_\alpha }^{b_\alpha }\frac{dz\, \mathcal{N}_z\left( \frac{\tilde{\Delta }_{\text {fit}}}{\sqrt{\alpha }},\frac{\tilde{\sigma }}{\sqrt{\alpha }}\right) \,4e^{4(z\pm b_\alpha )\tilde{x}}}{(b_\alpha -z)^{-1}+{(b_\alpha +z)^{-1}}}\,, \end{aligned}$$
(30)

where ± corresponds to \(x<0\) and \(x>0\) respectively, and where \(b_\alpha =|\tilde{\Delta }_{\text {bias}}|\). For the frozen agents we have

$$\begin{aligned} P_{\text {fr},+}(\tilde{x},\tilde{t})= & {} \int _{b_\alpha }^{\infty }dz\, \mathcal{N}_z\left( \frac{\tilde{\Delta }_{\text {fit}}}{\sqrt{\alpha }},\frac{\tilde{\sigma }}{\sqrt{\alpha }}\right) \, \mathcal{N}_{\tilde{x}}(\tilde{t}(z-b_\alpha ),\sigma _{\tilde{t}})\nonumber \\ P_{\text {fr},-}(\tilde{x},\tilde{t})= & {} \int _{-\infty }^{-b_\alpha }dz\, \mathcal{N}_z\left( \frac{\tilde{\Delta }_{\text {fit}}}{\sqrt{\alpha }},\frac{\tilde{\sigma }}{\sqrt{\alpha }}\right) \, \mathcal{N}_{\tilde{x}}(\tilde{t}(z+b_\alpha ),\sigma _{\tilde{t}})\,,\nonumber \\ \end{aligned}$$
(31)

where \(\sigma ^2_{\tilde{t}}=\tilde{t}/2\). These expressions are compared to direct simulations of the game for intermediate \(\alpha \approx 4\) in Fig. 5. The simulations are averaged over a specific time window and the diffusive component Eq. 31 is integrated over the corresponding scaled time window. The agreement is excellent over the complete stationary and diffusive components of the distribution and demonstrates the data collapse in terms of scaled coordinates. In Fig. 6 we also show a comparison for large \(\alpha \approx 80\), where the simulations have no frozen agents and all fickle agents are localized with a localization length close to the \(\alpha \rightarrow \infty \) value \(x_0=\sqrt{\pi N/8}\).
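For orientation, a minimal sketch of how the fickle part, Eq. 30, can be evaluated numerically is given below. The default parameter values are only the \(\phi _1=\phi _2=0\) limits of the Appendix-1 functions, used here as placeholders; the curves in Figs. 5 and 6 use the full self-consistent expressions.

```python
import numpy as np
from scipy.integrate import quad

def p_fickle(x_scaled, alpha,
             d_fit=1 / np.pi, d_bias=1 / np.sqrt(2 * np.pi), s=1 / np.sqrt(6)):
    """Fickle-agent contribution of Eq. 30 in scaled units x/sqrt(N).

    d_fit, d_bias and s stand in for the Appendix-1 functions evaluated at the
    self-consistent (phi1, phi2); the defaults are only their phi1 = phi2 = 0 limits.
    """
    b = d_bias
    mean, sig = d_fit / np.sqrt(alpha), s / np.sqrt(alpha)
    pm = -1.0 if x_scaled >= 0 else 1.0          # exponent 4(z -/+ b) x

    def integrand(z):
        gauss = np.exp(-(z - mean) ** 2 / (2 * sig ** 2)) / (np.sqrt(2 * np.pi) * sig)
        p0 = 4.0 / (1.0 / (b - z) + 1.0 / (b + z))
        return gauss * p0 * np.exp(4 * (z + pm * b) * x_scaled)

    value, _ = quad(integrand, -b, b)
    return value

# Example: the stationary fickle component at alpha ~ 4
print([round(p_fickle(x, 4.0), 4) for x in np.linspace(-2, 2, 9)])
```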

Fig. 5

Full scaled distribution \(P_{\tilde{x}}\) with \(\tilde{x}=x/\sqrt{N}\) over all agents for \(\alpha \approx 4\) compiled by averaging simulations over the scaled time window \(\tilde{t}_0=t_0/N\) to \(\tilde{t}_1=t_1/N\). The model results (“fickle+frozen”) are \(P_{\tilde{x}}=\frac{1}{\tilde{t}_1-\tilde{t}_0}\int _{\tilde{t}_0}^{\tilde{t}_1}d\tilde{t}P(\tilde{x},\tilde{t})\), using Eqs. 30 and 31. Also shown are model results using only fickle agents. The following time windows are used: for \(N=501\), \(t_0=5e5\) to \(t_1=5e6\); for \(N=1001\), \(t=2t_0\) to \(2t_1\); for \(N=2001\), \(t=4t_0\) to \(4t_1\), which correspond to the same \(\tilde{t}_0\) and \(\tilde{t}_1\). (Simulations are averaged over 80 runs for \(N=501\) and 15 runs for \(N=1001\) and 2001)

The asymmetry of these plots is an artefact of our gauge choice \(\vec {\xi }_i\cdot \vec {\Omega }\le 0\) which implies that on average agents will use strategy 1 (\(x>0\)) more frequently than strategy 2 (\(x<0\)). To restore the full symmetry is simply a matter of symmetrizing the distributions around \(x=0\).

Finally, we remark that the formal solution in terms of an exponential distribution of strategy scores for frozen agents was derived in [13] from a Fokker-Planck equation for the linear payoff game. See Appendix 2 and 3 for a further discussion of the comparison between the present model and the Hamiltonian formulation.

Fig. 6

Distribution \(P_{\tilde{x}}\) at large \(\alpha \approx 80\). There are no frozen agents, and the simulated and model (“fickle”) distributions are stationary. Also shown is the asymptotic \(\alpha \rightarrow \infty \) behavior where all agents are symmetrically localized with localization length \(x_0=\sqrt{\pi N/8}\), and a simulation at \(\alpha \approx 650\) which approaches this asymptotic behavior. (Simulations averaged over \(\sim 4e8\) time steps)

5 Summary

We have studied the asymmetric phase of the basic Minority Game, focusing on the statistical distribution of relative strategy scores and the original sign-payoff formulation of the game. We formulate a statistical model for the attendance that relies on a specific gauge choice in which the two strategies of each agent are ordered with respect to the background (\(\vec {\xi }_i\cdot \vec {\Omega }\le 0\) for all agents i). Using this model we can derive a distribution of the mean step per time increment for the relative scores, specified in terms of a bias for the used strategy and the relative fitness of the two strategies. The relative strategy score for each agent is conveniently described as a random walk on an integer chain, where the jump probabilities are calculated from the mean step. The probability distribution of observing the agent at some position on the chain at a given time is given either by a static asymmetric exponential localized around \(x=0\), for fickle agents, or by diffusion with a drift, for frozen agents. Excellent agreement with direct simulations of the game for the score distribution confirms the basic validity of the modelling. At the same time, as discussed in the appendix, the fluctuations of the attendance are overestimated by the model. By contrasting with the Hamiltonian formulation of the dynamics, the reason for this discrepancy is readily understood by viewing the model as a crude ansatz for the full minimization problem. This also opens up for improving the model by introducing some variational parameters, without having to confront the full complexity of minimizing a non-quadratic Hamiltonian for general payoff functions.

We thank Erik Werner for valuable discussions. Simulations were performed on resources at Chalmers Centre for Computational Science and Engineering (C3SE) provided by the Swedish National Infrastructure for Computing (SNIC).