Multi-Agent Reinforcement Learning Tool for Job Shop Scheduling Problems

  • Yailen Martínez Jiménez
  • Jessica Coto Palacio
  • Ann Nowé
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 1173)


The emergence of Industry 4.0 opens the way to new approaches for solving industrial problems such as the Job Shop Scheduling Problem. Multi-Agent Reinforcement Learning approaches have been shown to be highly promising for handling complex scheduling scenarios. In this work we propose a user-friendly Multi-Agent Reinforcement Learning tool intended to be more appealing for industry. It lets users interact with the learning algorithms so that all the constraints on the production floor are carefully included and the objectives can be adapted to real-world scenarios. The user can either keep the best schedule obtained by a Q-Learning algorithm or adjust it by fixing some operations in order to meet certain constraints; the tool then optimizes the modified solution, respecting the user's preferences, using one of two possible alternatives. These alternatives are validated on OR-Library benchmarks, and the experiments show that the modified Q-Learning algorithm obtains the best results.
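
For readers unfamiliar with the Q-Learning algorithm mentioned above, the following is a minimal sketch of the standard tabular Q-Learning update. It is not the tool's implementation: the abstract does not specify how states, actions, or rewards are defined, nor how the modified variant differs, so the function name `q_update`, the state and action labels, and the values of `alpha` and `gamma` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def q_update(Q, state, action, reward, next_state, next_actions,
             alpha=0.1, gamma=0.8):
    """Apply one standard tabular Q-Learning update and return the new value.

    Q is a mapping from (state, action) pairs to estimated values.
    alpha is the learning rate and gamma the discount factor; both
    defaults are illustrative assumptions, not values from the paper.
    """
    # Value of the best action available in the next state
    # (0.0 if the next state is terminal and has no actions).
    best_next = max((Q[(next_state, a)] for a in next_actions), default=0.0)
    # Move the current estimate toward the bootstrapped target.
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
    return Q[(state, action)]


# Hypothetical usage: an agent in state "s0" picks operation "op_a",
# receives a reward of 1.0, and lands in state "s1".
Q = defaultdict(float)
first = q_update(Q, "s0", "op_a", 1.0, "s1", ["op_a", "op_b"],
                 alpha=0.5, gamma=0.9)
```

In a scheduling setting one would typically let each agent choose which waiting operation to process next and derive the reward from a schedule quality measure such as the makespan, but that mapping is an assumption here rather than something the abstract states.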


Keywords: Reinforcement Learning, Multi-Agent Systems, Industry 4.0, Job Shop Scheduling



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Universidad Central “Marta Abreu” de Las Villas, Santa Clara, Cuba
  2. UEB Hotel Los Caneyes, Santa Clara, Cuba
  3. Vrije Universiteit Brussel, Brussels, Belgium
