1 Introduction

In this note, we study a well known result by Eshel et al. (1998). They showed that, if agents play a bilateral prisoner’s dilemma with their interaction neighbors in a circle network and learn about actions via imitation, cooperation can survive in what is called a stochastically stable state. In this paper, we explicitly distinguish between the set of agents one interacts with (interaction neighborhood) and the set of agents one possibly imitates (information neighborhood). We do this to allow for the fact that agents may also be informed about some agents beyond their interaction neighborhood (e.g. their friends’ friends) and will use this information when imitating an action. Footnote 1 We find that whenever agents are allowed to hold some information beyond their interaction neighbors, the unique stochastically stable state is one where everyone chooses defection. We then introduce a conformist bias into the imitation process, relying on the model by Ellison and Fudenberg (1993). We find that, if this conformist bias is strong enough, full cooperation always emerges, irrespective of whether agents hold information beyond their interaction neighbors. Conformism is thus identified as an important and new mechanism that can stabilize cooperation in a local interaction environment.

We also show that the result from Eshel et al. (1998) does not extend to general networks irrespective of whether agents hold information about others beyond their interaction neighbors. In particular, we give examples of asymmetric networks (where not all agents have the same number of neighbors) for which the unique stochastically stable state under payoff-biased imitation involves full defection. Conformism—on the other hand—stabilizes cooperation in these networks.

Previous literature has also explained cooperation in networks through other mechanisms. Marsili et al. (2005) highlight the importance of the clustering degree for sustaining cooperation. Zimmermann et al. (2004), Fosco and Mengel (2008) or Hanaki et al. (2007), among many others, explain cooperation through exclusion of non-cooperators in a dynamic network setting. The role of a conformist bias in imitation has been examined by Ellison and Fudenberg (1993) to study the spread of an efficient technology in a one-person decision problem. Levine and Pesendorfer (2007) explain cooperation through an imitation process in a setup where agents get some information about the opponent’s strategy prior to interaction.

The paper is organized as follows. In Section 2, the model is presented, and in Section 3, the model is analyzed. Section 4 concludes. The proofs are relegated to an Appendix.

2 The model

2.1 The local interaction game

There are i = 1,...,n agents interacting in a 2×2 prisoner’s dilemma game through a circle network. Interactions are not necessarily restricted to an agent’s first-order neighbors. For any number h ∈ ℕ+ we denote by \(N_{i}^{h}\) the set of agents within a radius h of “geodesic” distance to agent i. Denote \(N_{i}^{Z}\) the set of agents with whom agent i interacts, the “interaction neighborhood” of player i. In general, the set of agents with whom i interacts (\(N_{i}^{Z}\)) will not equal the set of agents about whom i has information. Denote the latter set, the information neighborhood of agent i, by \(N_{i}^{I}\). Footnote 2 We will assume I ≥ Z. Let it be a convention that \(N_{i}^{Z}\) does not contain player i herself, while \(N_{i}^{I}\) does. As an illustration, consider the circle with interaction radius Z = 1 and information radius I = 2 depicted below.

$$....\overset{N_{i}^{I}}{\overbrace{\left( i-2\right) -\underset{N_{i}^{Z}}{ \underbrace{(i-1)}}-i-\underset{N_{i}^{Z}}{\underbrace{(i+1)}}-(i+2)}} -\left( i+3\right) ... $$

Individuals play a 2×2 prisoner’s dilemma with their interaction neighbors \(N_{i}^{Z}\). The set of actions is given by A = {C,D} for all players. For each pair of actions \(a_{i},a_{j}\in A\) the payoff \(\pi _{i}(a_{i},a_{j})\) that player i earns when playing action \(a_{i}\) against an opponent who plays \(a_{j}\) is given by the following matrix,

$$\begin{tabular}{|l|l|l|} \hline $a_{i}\backslash a_{j}$ & $C$ & $D$ \\ \hline $C$ & $\alpha $ & $\beta $ \\ \hline $D$ & $\gamma $ & $\delta $ \\ \hline \end{tabular}$$
(1)

We are interested in the case γ > α > δ > β > 0, i.e. the case where matrix (1) represents a prisoner’s dilemma. Assume also that \(\alpha >\frac{\beta +\gamma }{2}\), i.e. that cooperation (C) is efficient. Footnote 3 The payoffs at time t for player i from playing action \(a_{i}\) are given by Footnote 4

$$\Pi _{i}^{t}(a_{i}^{t},a_{j}^{t})=\sum\limits_{j\in N_{i}^{Z}}\pi _{i}\left(a_{i}^{t},a_{j}^{t}\right).$$
(2)

Denote \(\Pi ^{t}(N_{i}^{I},a)=\frac{\sum_{k\in N_{i}^{I}|a_{k}^{t}=a}\Pi _{k}^{t}(\cdot )}{{\rm card}\{k\in N_{i}^{I}|a_{k}^{t}=a\}}\) the average payoff of all agents in \(N_{i}^{I}\) that choose action a and let \(\Pi ^{t}(N_{i}^{I},a)=0\) if \({\rm card}\{k\in N_{i}^{I}|a_{k}^{t}=a\}=0.\)
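The payoff definitions above can be sketched in code. The payoff numbers below are our own illustrative choices (satisfying γ > α > δ > β > 0 and α > (β + γ)/2), not values from the paper:

```python
# Sketch of the payoff definitions on a circle; illustrative payoffs only.
ALPHA, BETA, GAMMA, DELTA = 4.0, 0.5, 6.0, 1.0

def pi(a_i, a_j):
    """Stage-game payoff pi_i(a_i, a_j) from matrix (1)."""
    table = {('C', 'C'): ALPHA, ('C', 'D'): BETA,
             ('D', 'C'): GAMMA, ('D', 'D'): DELTA}
    return table[(a_i, a_j)]

def neighbors(i, h, n):
    """N_i^h: agents within geodesic distance h on the circle (excluding i)."""
    return [(i + d) % n for d in range(-h, h + 1) if d != 0]

def total_payoff(i, actions, Z):
    """Pi_i^t: sum of stage payoffs against all interaction neighbors N_i^Z."""
    n = len(actions)
    return sum(pi(actions[i], actions[j]) for j in neighbors(i, Z, n))

def avg_payoff(i, actions, Z, I, a):
    """Pi^t(N_i^I, a): average payoff of agents in N_i^I (including i) playing a."""
    n = len(actions)
    group = [k for k in neighbors(i, I, n) + [i] if actions[k] == a]
    if not group:
        return 0.0  # the convention from the text when no one plays a
    return sum(total_payoff(k, actions, Z) for k in group) / len(group)
```

For instance, with Z = 1 a lone defector on an all-cooperating circle earns \(2\gamma\), while `avg_payoff` over an information neighborhood averages the total payoffs of those members playing the given action.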

2.2 Learning

At each point in time t = 1,2,3,... the state of the system is given by the action choices of all agents \(s(t)=\left( a_{i}^{t}\right)_{i=1}^{n}\). Denote S the state space. At each point in time a (small) number r of agents is randomly selected to revise their action choices. We consider two possible decision rules. First, we consider the rule typically used in the literature, where agents rely on payoff-biased imitation. Then we add a conformist bias into the imitation process.

2.2.1 Payoff-biased imitation

Under the basic process, an agent who is selected to revise her action choice compares the average payoff in her information neighborhood of the action she is currently not choosing, \(\lnot a_{i}\), with that of her current action \(a_{i}\). If and only if

$$\Pi ^{t-1}(N_{i}^{I},\lnot a_{i})-\Pi ^{t-1}(N_{i}^{I},a_{i})>0$$
(3)

she changes her action. With small probability ε she trembles and reverses her choice. This is the rule used, for example, by Eshel et al. (1998).

2.2.2 Payoff- and conformist-biased imitation

The process with conformism takes into account the possibility that agents may be more inclined to make more “popular” choices. Decision rule (3) is replaced by the following rule,

$$\Pi ^{t-1}(N_{i}^{I},\lnot a_{i})-\Pi ^{t-1}(N_{i}^{I},a_{i})>m(1-2x_{\lnot a_{i}}).$$
(4)

Here m ∈ ℝ+ is a finite conformity parameter and \(x_{\lnot a_{i}}\) is the share of all agents that i knows about that use a different action than she does, i.e. \(x_{\lnot a_{i}}=(2I+1)^{-1}{\rm card}\{j\in N_{i}^{I}|a_{j}\neq a_{i}\}\). Obviously m = 0 corresponds to the basic process. If both actions are equally popular, i.e. if \(x_{\lnot a_{i}}=1/2\), the agent is not biased towards using either of them. If one of the actions is more popular, on the other hand, the agent will, ceteris paribus, be more inclined to use that action. Decision rule (4) is essentially the rule used in Ellison and Fudenberg (1993). Footnote 5
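Decision rule (4) can be written as a one-line condition. A minimal sketch, with illustrative payoff numbers of our own:

```python
# A minimal sketch of decision rule (4); the numbers passed below are
# illustrative, not taken from the paper.

def wants_to_switch(avg_other, avg_own, m, x_other):
    """Rule (4): switch iff the payoff advantage of the other action
    exceeds the conformity penalty m * (1 - 2 * x_other).

    avg_other -- average payoff of the action i is not playing
    avg_own   -- average payoff of i's current action
    m         -- conformity parameter (m = 0 recovers rule (3))
    x_other   -- share of agents in N_i^I playing the other action
    """
    return avg_other - avg_own > m * (1 - 2 * x_other)

# m = 0: only payoffs matter, so a 2-point advantage triggers a switch.
print(wants_to_switch(10.0, 8.0, m=0.0, x_other=0.2))   # True
# With conformism the same advantage fails against an unpopular action:
# the penalty is 5 * (1 - 0.4) = 3 > 2.
print(wants_to_switch(10.0, 8.0, m=5.0, x_other=0.2))   # False
# With equal popularity (x_other = 1/2) the bias vanishes.
print(wants_to_switch(10.0, 8.0, m=5.0, x_other=0.5))   # True
```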

2.3 Techniques used in the analysis

The learning process described in Section 2.2 (under either decision rule) gives rise to a finite Markov chain, for which the standard techniques apply. Denote \(P^{\varepsilon }(s,s')\) the transition probability for a transition from state s to s′ and \(P^{0}(s,s')\) the transition probability if ε = 0. An absorbing set under \(P^{0}\) is a minimal subset of states which, once entered, is never left. An absorbing state is a singleton absorbing set, or in other words,

Definition 1

State s is absorbing \(\Leftrightarrow P^{0}(s,s)=1\).

As trembles make transitions between any two states possible, the perturbed Markov process has a unique stationary distribution, denoted \(\mu ^{\varepsilon }\). Footnote 6 The limit invariant distribution \(\mu ^{\ast }=\lim_{\varepsilon \rightarrow 0}\mu ^{\varepsilon }\) exists and its support \(\{s\in S|\lim_{\varepsilon \rightarrow 0}\mu ^{\varepsilon }(s)>0\}\) is a union of some absorbing sets of the unperturbed process. The limit invariant distribution singles out a stable prediction of the unperturbed dynamics (ε = 0) in the sense that, for any ε > 0 small enough, play approximates that described by \(\mu ^{\ast }\) in the long run. The states in the support of \(\mu ^{\ast }\) are called stochastically stable states. These are the states we focus on.

Definition 2

State s is stochastically stable \(\Leftrightarrow \mu ^{\ast }(s)>0\).

Denote ω the union of one or more absorbing sets and Ω the set of all absorbing sets. Define \(X(\omega ,\omega ')\) as the minimal number of trembles necessary to reach ω′ from ω. Footnote 7 The stochastic potential ψ(s) of a state s ∈ Ω is defined as the sum of the minimal numbers of trembles necessary to induce a (possibly indirect) transition to s from each alternative state s′ ∈ Ω, i.e. \(\psi (s)=\sum_{s^{\prime }\in \Omega }X(s^{\prime },s)\).

Result

(Young 1993) State \(s^{\ast }\) is stochastically stable if it has minimal stochastic potential, i.e. if \(s^{\ast }\in \arg \min_{s\in \Omega }\psi (s)\).

The intuition behind Young’s result is simple. In the long run, the process will spend most of the time in one of its absorbing states. The stochastic potential of any state s is a measure of how easy it is to jump from the basin of attraction of other absorbing states to that of s by perturbing the process a little.
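This selection logic can be seen in a toy two-state example (our own illustration, not the paper's model): if leaving state A requires two simultaneous trembles (probability of order ε²) while leaving B requires only one (order ε), then A has the smaller stochastic potential and attracts all probability mass as ε → 0:

```python
# Toy two-state perturbed chain: P(A->B) = eps**2, P(B->A) = eps.
# A should be stochastically stable: mu(A) -> 1 as eps -> 0.

def stationary(eps):
    """Stationary distribution (mu_A, mu_B), from the balance condition
    mu_A * P(A->B) = mu_B * P(B->A)."""
    p_ab, p_ba = eps ** 2, eps
    mu_a = p_ba / (p_ab + p_ba)
    return mu_a, 1.0 - mu_a

for eps in (0.1, 0.01, 0.001):
    print(eps, stationary(eps)[0])   # mu(A) approaches 1 as eps shrinks
```

Here mu(A) = 1/(1 + ε), so the limit invariant distribution puts probability one on A, the state that is harder to leave by trembles.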

3 Analysis

Throughout the analysis, we assume that I is small relative to the number of players n. In particular, we will assume that \(I<\frac{n-2}{4}\), ensuring that, for any agent, at least one other agent can be found such that their information neighborhoods are disjoint. Footnote 8 Let us briefly characterize absorbing states.

Absorbing states

States where \(a_{i}=a\) for all i ∈ G are absorbing. Furthermore, there exists \(\overline{\alpha} (m,Z,I)>0\) such that a set of polymorphic states is absorbing whenever \(\alpha >\overline{\alpha }(\cdot)\). In all such states, there are strings of cooperators separated by strings of defectors.

Proof

Appendix. □

The exact composition of the set of polymorphic absorbing states depends on the coefficient for conformism m as well as the information radius I and the interaction radius Z. In the following, we will denote \(s^{a}\) the state where all agents play action a ∈ {C,D} and \(\omega ^{CD}\) the set of polymorphic absorbing states.

3.1 Payoff-biased imitation in the circle network

Start with a situation where I > Z, i.e. where agents hold some information about others beyond their interaction neighborhood. Agents can obtain such information, for example, if their friends tell them about their friends, etc. We will show that if agents rely on payoff-biased imitation only, the unique outcome in these situations is full defection. As an illustration, consider the network depicted in Fig. 1, where I = 2 and Z = 1. Then from the fully cooperative state \(s^{C}\) one tremble by any player can induce a transition to state \(s^{D}\) with full defection. To see this, assume that starting from \(s^{C}\) player 2 trembles and switches to action D. Now player 4 will want to imitate player 2, as the average defector payoff in his information neighborhood \(\Pi ^{t}(N_{4}^{I},D)=2\gamma \) exceeds the average cooperator payoff \(\Pi ^{t}(N_{4}^{I},C)\) (Fig. 1).

Fig. 1
figure 1

Circle network with I = 2 and Z = 1. Player 2 is a defector

Consider next player 6 and note that \(N_{6}^{I}=\{4,5,6,7,8\}\). Consequently \(\Pi ^{t}(N_{6}^{I},D)=2\gamma\, >\, \Pi ^{t}(N_{6}^{I},C)\) and player 6 will switch to defection. If next player 8, then players 10, 12, etc. switch to defection, all remaining cooperating players will be surrounded by defectors. Consequently \(\Pi ^{t}(N_{i}^{I},D)>2\beta = \Pi ^{t}(N_{i}^{I},C)\) for all i ∈ G, and the remaining cooperators will also want to switch to defection. We end up in \(s^{D}\). Such a transition after only one tremble is possible because I > Z. This allows defection to spread “across long distances”, ensuring that cooperators always interact with more defectors than the defectors themselves do during the transition. On the other hand, it is clear that a transition from \(s^{D}\) to a state characterized by some cooperation needs more than one tremble, as a single cooperator surrounded by defectors will have the minimum possible payoff and will never be imitated. This underlies the fact that the unique stochastically stable state is \(s^{D}\) (Fig. 2).
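The cascade just described can be verified with a short script. The payoffs, the circle size (n = 12) and the revision order are our own illustrative choices, following the order in which the text lets players revise:

```python
# One tremble spreads defection when I = 2 > Z = 1 under rule (3).
ALPHA, BETA, GAMMA, DELTA = 4.0, 0.5, 6.0, 1.0   # illustrative payoffs
PI = {('C','C'): ALPHA, ('C','D'): BETA, ('D','C'): GAMMA, ('D','D'): DELTA}

def nbrs(i, h, n):
    return [(i + d) % n for d in range(-h, h + 1) if d != 0]

def payoff(i, acts, Z=1):
    return sum(PI[(acts[i], acts[j])] for j in nbrs(i, Z, len(acts)))

def avg(i, acts, a, Z=1, I=2):
    group = [k for k in nbrs(i, I, len(acts)) + [i] if acts[k] == a]
    return sum(payoff(k, acts, Z) for k in group) / len(group) if group else 0.0

acts = ['C'] * 12
acts[2] = 'D'   # a single tremble by player 2
# Defection first jumps across distance 2 (players 4, 6, 8, 10, 0),
# then the surrounded cooperators give up.
for i in [4, 6, 8, 10, 0, 1, 3, 5, 7, 9, 11]:
    other = 'D' if acts[i] == 'C' else 'C'
    if avg(i, acts, other) - avg(i, acts, acts[i]) > 0:   # rule (3)
        acts[i] = other
print(acts)   # all 'D': one tremble sufficed because I > Z
```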

Fig. 2
figure 2

Circle network with I = 2 and Z = 1. Player 6 imitates player 4

Proposition 1

If I > Z the unique stochastically stable state is \(s^{D}\).

Proof

Appendix. □

Next we consider the case I = Z, previously examined by Eshel et al. (1998). This case reflects situations where agents’ information is restricted to their interaction partners. Examples of such situations are anonymous interactions, such as the interaction between buyers and sellers in a supply chain. Now a transition from \(s^{C}\) to \(s^{D}\) is not always possible after one action tremble. As an illustration, consider the network depicted in Fig. 3, where I = Z = 1. Let player 2 tremble to action D. Her action will be imitated by one of her interaction partners, as \(N_{2}^{I}=\{1,2,3\}=N_{2}^{Z}\) (Fig. 3).

Fig. 3
figure 3

Circle network with I = Z = 1. Player 2 chooses defection

Say player 1 switches to D. As \(N_{1}^{I}\cup N_{2}^{I}=\{n,1,2,3\}\), the only players who might now adopt defection are players n and 3 (Fig. 4). The average payoff of cooperating agents in these information neighborhoods is given by \(\Pi ^{t}(N_{n}^{I},C)=\Pi ^{t}(N_{3}^{I},C)= \frac{3\alpha +\beta }{2}\). The average payoff of defectors is given by \(\Pi ^{t}(N_{n}^{I},D)=\Pi ^{t}(N_{3}^{I},D)=\gamma +\delta\). Defection will spread if and only if \(\alpha <\frac{2(\gamma +\delta )-\beta }{3}\). In this case, a transition to \(s^{D}\) can be induced via one tremble.
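The threshold comparison above is easy to check numerically. With our illustrative payoffs (satisfying γ > α > δ > β > 0 and α > (β + γ)/2):

```python
# Numeric check of the I = Z = 1 comparison; illustrative payoffs only.
ALPHA, BETA, GAMMA, DELTA = 4.0, 0.5, 6.0, 1.0

avg_coop = (3 * ALPHA + BETA) / 2            # Pi^t(N_3^I, C) after 1, 2 defect
avg_def = GAMMA + DELTA                      # Pi^t(N_3^I, D)
threshold = (2 * (GAMMA + DELTA) - BETA) / 3

# Defection spreads iff avg_def > avg_coop, which is algebraically
# equivalent to alpha < threshold.
print(avg_coop, avg_def, threshold)          # 6.25 7.0 4.5
```

Here α = 4.0 < 4.5, so defection spreads and the transition to \(s^{D}\) needs only one tremble for these payoffs.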

Fig. 4
figure 4

Circle network with I = Z = 1. Player 3 imitates player 2

What happens with the reverse transition from \(s^{D}\) to \(s^{C}\)? Under some conditions on the payoffs, two simultaneous trembles suffice for the cooperative action to spread through the graph until two defectors are left. These defectors, though, will never want to imitate the cooperative action, and we end up in a polymorphic stochastically stable state. More precisely,

Proposition 2

If I = Z, there exists \(\overline{\alpha}(I,Z)>0\) such that, if the game payoffs satisfy \(\alpha \geq \overline{\alpha }(\cdot)\), state s is stochastically stable \(\Rightarrow s\in \omega ^{CD}\). If \(\alpha <\overline{\alpha }(\cdot)\) the unique stochastically stable state is \(s^{D}\).

Proof

Appendix. □

Less information thus actually helps cooperation. The reason is that defection can now only spread locally, forcing defectors to interact with each other. This reduces the average payoff of defectors, revealing the social benefit of cooperation. Proposition 2 essentially generalizes the result from Eshel et al. (1998). Footnote 9

3.2 Payoff-biased and conformist-biased imitation in the circle network

In this section, we assume that agents display a conformist bias, i.e. that they are more inclined to imitate more popular actions. We show that, if imitation is payoff-biased and (sufficiently) conformist-biased, the unique stochastically stable state involves full cooperation. Consider Fig. 5, where I = 2 and Z = 1. Now, starting from \(s^{C}\), assume that player 3 trembles and switches to D. Player 1 will imitate player 3 if and only if the payoff advantage of defection is high enough to make up for the “unpopularity” of this action.

Fig. 5
figure 5

Circle network with conformist bias

It is shown in the Appendix that a necessary condition for action D to spread through the whole graph after just one tremble is \(m<\frac{ (2I+1)[2IZ(\gamma -\alpha )+Z(\alpha -\beta )]}{(2I-1)I}\). What happens if agents display a larger degree of conformism? Then (at least) one more tremble in agent 1’s information neighborhood is needed for him to switch. Assume player 1 is willing to imitate D after both players 3 and n − 1 have switched to D. Can defection spread beyond \(N_{1}^{I}\)? Consider the decision of player 4, whose information neighborhood is given by \(N_{4}^{I}=\{2,3,4,5,6\}\). Players 4, 5 and 6 are cooperators. As one defector is not enough to induce imitation, both 2 and 3 have to play defect for defection to spread through the operation of the unperturbed dynamics alone. But then we have a string of interacting defectors ...1 − 2 − 3 − 4.... A small amount of conformism can thus be enough to force transitions in which defectors interact mainly among each other. This reduces the payoff advantage of defectors compared to cooperators, revealing the social benefit of cooperation.
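The blocking effect of conformism can be checked numerically. A sketch with our own illustrative payoffs on a circle of 12 players (I = 2, Z = 1): after a single tremble, the cooperator at distance 2 imitates under rule (3) (m = 0) but not when m is large.

```python
# Conformism (rule (4)) can block the spread of defection after one tremble.
ALPHA, BETA, GAMMA, DELTA = 4.0, 0.5, 6.0, 1.0   # illustrative payoffs
PI = {('C','C'): ALPHA, ('C','D'): BETA, ('D','C'): GAMMA, ('D','D'): DELTA}

def nbrs(i, h, n):
    return [(i + d) % n for d in range(-h, h + 1) if d != 0]

def payoff(i, acts, Z=1):
    return sum(PI[(acts[i], acts[j])] for j in nbrs(i, Z, len(acts)))

def avg(i, acts, a, Z=1, I=2):
    group = [k for k in nbrs(i, I, len(acts)) + [i] if acts[k] == a]
    return sum(payoff(k, acts, Z) for k in group) / len(group) if group else 0.0

def switches(i, acts, m, Z=1, I=2):
    """Decision rule (4): switch iff the payoff gap beats the popularity penalty."""
    other = 'D' if acts[i] == 'C' else 'C'
    info = nbrs(i, I, len(acts)) + [i]
    x_other = sum(acts[k] == other for k in info) / len(info)
    gap = avg(i, acts, other, Z, I) - avg(i, acts, acts[i], Z, I)
    return gap > m * (1 - 2 * x_other)

acts = ['C'] * 12
acts[3] = 'D'                    # a single tremble
print(switches(1, acts, m=0))    # True: without conformism player 1 imitates
print(switches(1, acts, m=10))   # False: enough conformism blocks the spread
```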

Let us now consider the reverse transition from \(s^{D}\) to \(s^{C}\). As before, transitions after one tremble are not possible, as single cooperators will never be imitated. Depending on the degree of conformism, more or fewer simultaneous trembles are needed in a given information neighborhood to induce a transition. Note, though, that there is a positive feedback effect, as more trembles of connected cooperators increase the payoff advantage of cooperation over defection. This in turn reduces the need for cooperation to be “popular” in order to spread. Higher degrees of conformism thus favor cooperative outcomes. We can state the following proposition.

Proposition 3

Assume I > Z and that agents display a conformist bias. There exist \(\underline{m}(Z,I)>0\) and \(\overline{m}(Z,I)>0\) such that, if \(m\geq \overline{m}(\cdot)\), the unique stochastically stable state is \(s^{C}\). If \(m\leq \underline{m}(\cdot)\), Proposition 1 applies. Furthermore, \(\overline{m}(\cdot)\) and \(\underline{m}(\cdot)\) are strictly decreasing in I and increasing in Z.

Proof

Appendix. □

Conformism can stabilize cooperation. Furthermore, if imitation is conformist-biased, more information (larger I) actually helps cooperation. The intuition is as follows. Conformism helps sustain cooperation because it requires the formation of strings of cooperators or defectors during any transition. Given that these strings exist, more information helps achieve cooperation because it enables agents to “look deeper” into the strings. This increases the number of cooperators interacting with cooperators and of defectors interacting with defectors in any agent’s sample and makes more evident the higher payoff that cooperation yields to a community.

The case I = Z confirms the results from the case I > Z. Again conformism helps sustain cooperation, as the following proposition shows.

Proposition 4

Assume I = Z and that agents display a conformist bias. There exist \(\underline{m}(Z)>0\) and \(\overline{m}(Z)>0\) such that, if \(m\geq \overline{m}(\cdot)\), the unique stochastically stable state is \(s^{C}\). If \(m\leq \underline{m}(\cdot)\), Proposition 2 applies.

Proof

Appendix. □

3.3 Other networks

In this section, we would like to point to another dimension in which the Eshel et al. (1998) result is not robust, but where imitation with a conformist bias yields cooperative outcomes in the long run. In particular, we want to discuss two asymmetric networks (where not all players have the same number of neighbors) and show that, while with decision rule (3) stochastically stable outcomes yield defection, cooperation is obtained with a conformist bias.

Consider first the interconnected star network depicted in Fig. 6. Footnote 10

Fig. 6
figure 6

Interconnected star network

In this network there are two types of agents: some “centers” with a “high” degree k (like agents 1 and 2) and some “spokes” with degree one. The stars are interlinked, as each center is linked to \(k'\ll k\) other centers. Assume I = Z, as in the original model from Eshel et al. (1998). A transition from any state to \(s^{D}\) can occur via one tremble by one of the centers, infecting first all the centers and then the “spokes”. As the reverse transitions always need more than one tremble, with decision rule (3) the unique stochastically stable state will be \(s^{D}\). What happens under decision rule (4)? If the conformist bias is strong enough, agents in the periphery will always conform to what the center (their unique neighbor) does. A single action tremble by the center can then infect any star. Now if I > Z, a small number of infected stars (how small depends on m, I and Z) suffices to induce a transition to \(s^{C}\), as agents in the cooperative stars will earn higher payoffs than agents in defective stars. For transitions to \(s^{D}\), though, all centers have to tremble simultaneously if the conformist bias is strong enough.

Proposition 5a

If I ≥ Z, the unique stochastically stable state in the interconnected star is given by \(s^{D}\) under decision rule (3). There exists \(\overline{m}(Z,I)\) such that, whenever \(m>\overline{m}(\cdot)\), the unique stochastically stable state is \(s^{C}\) if I > Z. If I = Z, both states \(s^{C}\) and \(s^{D}\) are stochastically stable.

Proof

Appendix. □

Of course, the interconnected star is a network with extreme asymmetries in degree, and cooperation might obtain even under decision rule (3) as long as the asymmetry is not too extreme. Consider thus the crystal network depicted in Fig. 7. In this network there is an agent i with degree d (in Fig. 7, d = 6) whose first-order neighbors have degree d − 1, whose second-order neighbors have degree d − 2, and so on down to some minimal degree. Again, for this network the unique stochastically stable state (with I = Z) is one where everyone chooses defection. Defection can spread after a tremble by player i, infecting one player \(j\in N_{i}^{1}\), then one player \(k\in N_{j}^{1}\backslash N_{i}^{1}\), and so on. Footnote 11 Again, decision rule (4) leads to cooperation in this example (whenever m is “large enough”), because conformism forces actions to spread locally, thereby revealing the benefit of cooperation.

Fig. 7
figure 7

Crystal network

Proposition 5b

Assume I ≥ Z. The unique stochastically stable state in the crystal network is given by \(s^{D}\) under decision rule (3) and by \(s^{C}\) under rule (4), whenever \(m>\overline{m}(I,Z)\) for some finite \(\overline{m}(I,Z)>0\).

Proof

Appendix. □

4 Conclusions

We have studied a model where agents interact in a prisoner’s dilemma through a local interaction structure. Agents learn about optimal actions through imitation. The set of agents they possibly imitate (their information neighborhood) can differ from the set of agents they interact with (their interaction neighborhood). If agents rely on payoff-biased imitation alone, choosing the action (cooperation or defection) that has yielded the higher payoff in the previous period, we find the following results.

  • If the information radius of agents exceeds their interaction radius, the unique stochastically stable outcome is full defection. Only if information radius and interaction radius coincide can some cooperation be obtained in a stochastically stable state. In this sense, more information hurts cooperation.

We then introduce a conformist bias into imitation, assuming that agents are more likely to adopt more “popular” actions and find the following.

  • If the conformist bias is large enough all stochastically stable outcomes involve cooperation.

  • If there is a conformist bias, more information helps cooperation.

The intuition is as follows. Because joint cooperation is beneficial for a community but not individually optimal, strings of cooperators are better off than strings of defectors, whereas single defectors are always better off than single cooperators. A larger information radius hurts in the standard case because it allows imitation across “long distances”, which works against the formation of strings. Intuitively, what conformism does is force actions to spread “locally”, thereby revealing the benefit of cooperation. A larger information radius helps in this case because it allows agents to “look deeper” into strings of cooperators and defectors.