Reinforcement Learning for Inventory Management

  • Conference paper

Part of the book series: Lecture Notes in Mechanical Engineering (LNME)

Abstract

Deciding how much to order at each stage of a supply chain is central to minimizing inventory costs. Managers tend to follow an ordering policy that serves their own stage, which hampers the overall performance of the supply chain. The literature shows that, with the advent of machine learning and artificial intelligence, the field has been moving from simple base-stock policies toward learning-based algorithms that reach near-optimal solutions. This paper first formulates a multi-agent, four-stage serial supply chain as a reinforcement learning (RL) model for the ordering management problem. It then optimizes an RL model for a single-agent supply chain using the Q-learning algorithm. Simulation results show that the RL model with Q-learning performs better than the Order-Up-To policy and the 1–1 policy.
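As a rough illustration of the single-agent idea described in the abstract, the sketch below applies tabular Q-learning to a toy one-stage ordering problem. It is not the authors' model: the state (on-hand inventory), the action (order quantity), the zero lead time, the uniform demand, the cost coefficients, and all function and parameter names are assumptions chosen only to keep the example small. Because the objective is cost, the update takes the minimum over next actions rather than the maximum.

```python
import random


def simulate_step(inventory, order, demand, max_inv=20,
                  hold_cost=1.0, short_cost=2.0):
    """One period of a hypothetical single-stage system (zero lead time)."""
    stock = min(inventory + order, max_inv)   # order arrives, capped by capacity
    unmet = max(demand - stock, 0)            # demand that cannot be served (lost)
    next_inv = max(stock - demand, 0)         # leftover units carried forward
    cost = hold_cost * next_inv + short_cost * unmet
    return next_inv, cost


def q_learning(episodes=2000, horizon=30, max_inv=20, max_order=10,
               alpha=0.1, gamma=0.95, epsilon=0.1, max_demand=8, seed=0):
    """Tabular Q-learning that minimizes expected discounted cost."""
    rng = random.Random(seed)
    # q[s][a]: estimated cost-to-go of ordering a units with s units on hand
    q = [[0.0] * (max_order + 1) for _ in range(max_inv + 1)]
    for _ in range(episodes):
        inv = rng.randint(0, max_inv)
        for _ in range(horizon):
            # epsilon-greedy: explore sometimes, otherwise pick the cheapest action
            if rng.random() < epsilon:
                action = rng.randint(0, max_order)
            else:
                action = min(range(max_order + 1), key=lambda a: q[inv][a])
            demand = rng.randint(0, max_demand)  # assumed uniform demand
            nxt, cost = simulate_step(inv, action, demand, max_inv)
            # Q-learning update for a cost signal: bootstrap on min over actions
            target = cost + gamma * min(q[nxt])
            q[inv][action] += alpha * (target - q[inv][action])
            inv = nxt
    return q


def greedy_policy(q):
    """Order quantity with the lowest estimated cost for each inventory level."""
    return [min(range(len(row)), key=lambda a: row[a]) for row in q]


if __name__ == "__main__":
    policy = greedy_policy(q_learning())
    for level in (0, 5, 10, 15, 20):
        print(f"inventory={level:2d} -> order {policy[level]}")
```

Reading the learned policy as a table of order quantities per inventory level makes it easy to compare against a fixed rule such as an order-up-to level, although a fair comparison would need the paper's actual demand process, lead times, and cost structure.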

Author information

Corresponding author

Correspondence to V. Madhusudanan Pillai.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Bharti, S., Kurian, D.S., Pillai, V.M. (2020). Reinforcement Learning for Inventory Management. In: Deepak, B., Parhi, D., Jena, P. (eds) Innovative Product Design and Intelligent Manufacturing Systems. Lecture Notes in Mechanical Engineering. Springer, Singapore. https://doi.org/10.1007/978-981-15-2696-1_85

  • DOI: https://doi.org/10.1007/978-981-15-2696-1_85

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-2695-4

  • Online ISBN: 978-981-15-2696-1

  • eBook Packages: Engineering, Engineering (R0)
