Adaptive State Space Partitioning for Dynamic Decision Processes

  • Ninja Soeffker
  • Marlin W. Ulmer
  • Dirk C. Mattfeld
Research Paper


As new business processes increasingly require real-time decisions, anticipatory decision making becomes necessary to use the available resources wisely. Dynamic real-time problems occur in many business fields, for example in vehicle routing applications where stochastic customer service requests expect a fast response. For anticipatory decision making, offline simulation-based optimization methods such as value function approximation are promising solution approaches. However, these methods require a suitable approximation architecture to store the value information for the problem states. This paper proposes an approach that finds and adapts this architecture iteratively during the approximation process. A computational proof of concept is presented for a dynamic vehicle routing problem. Compared with conventional architectures, the proposed method improves the solution quality and significantly reduces the required architecture size.
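To make the idea of an approximation architecture that is adapted during the approximation process more concrete, the following is a minimal Python sketch of a partition-based lookup table whose cells are refined where states are observed often. It is not the authors' algorithm: the class names `Cell` and `AdaptivePartition`, the split rule (split a cell after a fixed number of visits, at the midpoint of its longest dimension), and the constant step size `alpha` are all assumptions chosen for illustration.

```python
import random
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(eq=False)
class Cell:
    """Axis-aligned box [lo, hi) in state space that stores one value estimate."""
    lo: Tuple[float, ...]
    hi: Tuple[float, ...]
    value: float = 0.0
    count: int = 0  # visits since this cell was created

    def contains(self, state: Sequence[float]) -> bool:
        return all(l <= x < h for l, x, h in zip(self.lo, state, self.hi))


class AdaptivePartition:
    """Hypothetical lookup table that starts with one cell and refines itself."""

    def __init__(self, lo, hi, split_after: int = 50, alpha: float = 0.1):
        self.cells: List[Cell] = [Cell(tuple(lo), tuple(hi))]
        self.split_after = split_after  # assumed split trigger: visit count
        self.alpha = alpha              # assumed constant step size

    def find(self, state: Sequence[float]) -> Cell:
        for cell in self.cells:
            if cell.contains(state):
                return cell
        raise ValueError("state lies outside the partitioned region")

    def value(self, state: Sequence[float]) -> float:
        return self.find(state).value

    def update(self, state: Sequence[float], observed: float) -> None:
        """Move the cell's estimate toward an observed (simulated) value."""
        cell = self.find(state)
        cell.count += 1
        cell.value += self.alpha * (observed - cell.value)
        if cell.count >= self.split_after:
            self._split(cell)

    def _split(self, cell: Cell) -> None:
        # Halve the cell along its longest dimension; children inherit the value.
        d = max(range(len(cell.lo)), key=lambda i: cell.hi[i] - cell.lo[i])
        mid = (cell.lo[d] + cell.hi[d]) / 2
        left_hi = cell.hi[:d] + (mid,) + cell.hi[d + 1:]
        right_lo = cell.lo[:d] + (mid,) + cell.lo[d + 1:]
        self.cells.remove(cell)
        self.cells.append(Cell(cell.lo, left_hi, cell.value))
        self.cells.append(Cell(right_lo, cell.hi, cell.value))


if __name__ == "__main__":
    # Toy usage: states cluster in one corner, so that region is refined most.
    rng = random.Random(0)
    table = AdaptivePartition((0.0, 0.0), (1.0, 1.0), split_after=20)
    for _ in range(1000):
        s = (rng.random() * 0.5, rng.random() * 0.5)
        table.update(s, observed=s[0] + s[1])
    print(len(table.cells), "cells after 1000 simulated observations")
```

In this sketch, regions of the state space that are never visited keep a single coarse cell, while frequently visited regions are split repeatedly, which is one simple way an architecture can stay small while still resolving the states that matter.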


Approximate dynamic programming; Dynamic service routing; State space partitioning; Data-driven modeling and simulation; Simulation-based optimization



Copyright information

© Springer Fachmedien Wiesbaden GmbH, part of Springer Nature 2019

Authors and Affiliations

  • Ninja Soeffker (1), corresponding author
  • Marlin W. Ulmer (1)
  • Dirk C. Mattfeld (1)
  1. Technische Universität Braunschweig, Braunschweig, Germany
