Can Machines Learn Whether Machines Are Learning to Collude?

  • Jonathan Cave
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11938)


Abstract

Online economic interactions generate data that artificial intelligence (AI), machine learning (ML) and deep learning (DL) can use: businesses for predictive analytics, process optimisation and market power; consumers for search and choice; and governments for gathering evidence and regulating harmful behaviour. In algorithmic collusion (AC), revenue management algorithms implement tacitly collusive behaviour. This paper summarises theoretical and empirical evidence, considers how ML methods affect AC and asks whether regulators’ own algorithms can help. It also examines the links between Internet regulation and competition policy.

Early ML literature concerned programmes ‘learning’ their environments, e.g. predicting rivals’ prices to maximise profit by estimating prices and costs, identifying strategies or influencing rivals’ learning. Here, ML means AI that self-programs to optimise specific objectives (data and model ‘layers’), and DL means many-layered ML. Increased depth makes behaviour an intricate convolution of data and programme history, invisible to programmers and inexplicable to others. ML deployed by many firms at once may fail to converge or may have unintended consequences.

Many models use simple ML algorithms to demonstrate behaviour consistent with collusion, although without communication this is not classically collusive. Populations of simple AI can learn reward/punishment strategies that sustain profitable outcomes. This paper considers further variants that take into account strategic sophistication, finite memory or dominance elimination, and the impact of product characteristics and search. Simulation illustrates classic inefficiencies (overshoot, convergence to supracompetitive prices, cycles and endogenous market-sharing).
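The kind of simulation described above can be sketched as two independent Q-learning price-setters on a discrete grid, each conditioning on last period's price pair. This is a minimal illustrative sketch, not the paper's actual model: the price grid, the winner-takes-the-market demand function and all learning parameters are assumptions chosen only to make the dynamics visible.

```python
import random

random.seed(0)

PRICES = [1, 2, 3, 4, 5]   # hypothetical discrete price grid
COST = 1                   # constant marginal cost (assumed)

def profits(p1, p2):
    """Stylised Bertrand demand: the cheaper firm serves the whole
    market, ties split it. Purely illustrative."""
    demand = 10
    if p1 < p2:
        return (p1 - COST) * demand, 0.0
    if p2 < p1:
        return 0.0, (p2 - COST) * demand
    half = demand / 2
    return (p1 - COST) * half, (p2 - COST) * half

def train(episodes=20000, alpha=0.1, gamma=0.95, eps=0.1):
    """Two independent epsilon-greedy Q-learners; the state is last
    period's observed price pair."""
    q1, q2 = {}, {}
    state = (PRICES[0], PRICES[0])
    for _ in range(episodes):
        def act(q):
            if random.random() < eps:
                return random.choice(PRICES)
            return max(PRICES, key=lambda a: q.get((state, a), 0.0))
        a1, a2 = act(q1), act(q2)
        r1, r2 = profits(a1, a2)
        nxt = (a1, a2)
        for q, a, r in ((q1, a1, r1), (q2, a2, r2)):
            best_next = max(q.get((nxt, b), 0.0) for b in PRICES)
            old = q.get((state, a), 0.0)
            q[(state, a)] = old + alpha * (r + gamma * best_next - old)
        state = nxt
    return q1, q2, state

q1, q2, last = train()
print("last observed price pair:", last)
```

Because neither agent communicates, any supracompetitive prices that emerge arise purely from the reward/punishment structure the learners discover, which is exactly why such outcomes sit awkwardly within classical definitions of collusion.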

It is not clear what regulators could or should ban; can they detect AC or limit its consequences? We consider: restricting the information available to firms; constraining price dynamics; coding standards that incorporate regulatory compliance into ML objectives; and algorithmic detection of specified anticompetitive behaviours. For instance, likelihood-ratio policy gradient reinforcement learning algorithms are more likely to converge to collusive behaviours when they take other firms’ learning into account and are able to shape that learning, given a suitable prevalence of AI and network topology.
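The likelihood-ratio (score-function) policy gradient mentioned above can be sketched as a single REINFORCE pricing agent with a softmax policy over a price grid. This is a hedged, self-contained illustration, not the paper's algorithm: the demand function, price grid, fixed rival price and learning rates are all assumptions, and the rival here is static rather than itself learning.

```python
import math
import random

random.seed(1)

PRICES = [1.0, 1.5, 2.0, 2.5]   # hypothetical price grid
COST = 1.0                       # assumed marginal cost

def profit(p, rival_p=2.0):
    # Stylised linear demand; the rival's price is held fixed.
    q = max(0.0, 4.0 - 1.5 * p + 0.5 * rival_p)
    return (p - COST) * q

def softmax(theta):
    m = max(theta)
    exps = [math.exp(t - m) for t in theta]
    s = sum(exps)
    return [e / s for e in exps]

def reinforce(episodes=5000, lr=0.05):
    """REINFORCE with a moving-average baseline: the update weights
    the score function (grad of log-probability) by the advantage."""
    theta = [0.0] * len(PRICES)   # one logit per price
    baseline = 0.0
    for _ in range(episodes):
        probs = softmax(theta)
        # Sample a price from the softmax policy.
        r, c, a = random.random(), 0.0, len(probs) - 1
        for i, p in enumerate(probs):
            c += p
            if r <= c:
                a = i
                break
        reward = profit(PRICES[a])
        baseline += 0.01 * (reward - baseline)
        adv = reward - baseline
        # Likelihood-ratio update: d log pi / d theta_i = 1{i=a} - probs[i]
        for i in range(len(theta)):
            grad = (1.0 if i == a else 0.0) - probs[i]
            theta[i] += lr * adv * grad
    return theta

theta = reinforce()
best = PRICES[theta.index(max(theta))]
print("learned price:", best)
```

Extending this sketch so that each agent models, or deliberately shapes, the other agents' gradient updates is what the abstract suggests makes convergence to collusive behaviour more likely.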


Keywords: Algorithmic collusion · Machine learning · Antitrust



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. University of Warwick, Coventry, UK
