Reinforcement Learning Methods for Operations Research Applications: The Order Release Problem

  • Conference paper
Machine Learning, Optimization, and Data Science (LOD 2018)

Abstract

An important goal in Manufacturing Planning and Control systems is to achieve short and predictable flow times, especially where high flexibility in meeting customer demand is required. Besides achieving short flow times, one should also maintain high output and due-date performance. One approach to this problem is an order release mechanism, which collects all incoming orders in an order pool and then determines when to release them to the shop floor. A major disadvantage of traditional order release mechanisms is their inability to consider the nonlinear relationship between resource utilization and flow times, which is well known from practice and queuing theory. Therefore, we propose a novel adaptive order release mechanism that uses deep reinforcement learning to set the release times of the orders, and we provide several techniques for tackling challenging operations research problems with reinforcement learning. We use a simulation model of a two-stage flow shop and show that our approach outperforms well-known order release mechanisms.
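The nonlinear relationship between utilization and flow times mentioned above can be illustrated with a textbook queuing result. The sketch below is not taken from the paper: it assumes a single-server M/M/1 queue (Poisson arrivals, exponential processing times), for which the expected time in system is W = 1 / (mu - lambda). The function names and the chosen utilization levels are hypothetical and serve only as an illustration.

```python
def mm1_flow_time(arrival_rate: float, service_rate: float) -> float:
    """Expected time in system W = 1 / (mu - lambda) for a stable M/M/1 queue.

    Assumption: M/M/1 is used purely as an illustration; the paper's
    simulation model is a two-stage flow shop, not a single M/M/1 station.
    """
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: utilization must be below 1")
    return 1.0 / (service_rate - arrival_rate)


def flow_time_table(utilizations, service_rate=1.0):
    """Return (utilization, expected flow time) pairs for a fixed service rate."""
    return [
        (rho, mm1_flow_time(rho * service_rate, service_rate))
        for rho in utilizations
    ]


if __name__ == "__main__":
    # Flow time grows slowly at low utilization and sharply as it approaches 1.
    for rho, w in flow_time_table([0.5, 0.8, 0.9, 0.95]):
        print(f"utilization {rho:.2f} -> expected flow time {w:.1f}")
```

Under this assumed model, raising utilization from 50% to 90% increases the expected flow time from 2 to 10 mean processing times, which is the kind of nonlinearity a fixed-parameter release rule cannot account for.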

This research is partly supported by the Aktion D. Swarovski KG (2016) grant.

Notes

  1. ReLU stands for rectified linear unit.

Corresponding author

Correspondence to Manuel Schneckenreither.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Schneckenreither, M., Haeussler, S. (2019). Reinforcement Learning Methods for Operations Research Applications: The Order Release Problem. In: Nicosia, G., Pardalos, P., Giuffrida, G., Umeton, R., Sciacca, V. (eds) Machine Learning, Optimization, and Data Science. LOD 2018. Lecture Notes in Computer Science, vol 11331. Springer, Cham. https://doi.org/10.1007/978-3-030-13709-0_46

  • DOI: https://doi.org/10.1007/978-3-030-13709-0_46

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-13708-3

  • Online ISBN: 978-3-030-13709-0

  • eBook Packages: Computer Science (R0)
