Abstract
An approximate dynamic programming algorithm is proposed that incorporates a combined policy consisting of a value function approximation policy and a lookahead policy. The algorithm is validated on a set of instances of the nurse rostering problem, tackled as a multi-stage problem. In each stage, a weekly roster is constructed taking into account historical information about the nurse rosters of the previous week, with the demand for the following weeks treated as unknown. The proposed method consists of three phases. First, a pre-process phase generates a set of valid shift patterns. Next, a local phase solves the weekly optimization problem using the value function approximation policy. Finally, a global phase uses the lookahead policy to evaluate the weekly rosters within a lookahead period. Experiments on instances from the Second International Nurse Rostering Competition indicate that the method is able to solve large instances of the problem, which was not possible with a previous version of approximate dynamic programming.
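The three phases described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the chapter: the shift set, the hard rules, the understaffing cost, the single-feature linear value function, the random demand model, and all function names (`valid_patterns`, `local_phase`, `global_phase`, and so on) are hypothetical. Random sampling of rosters stands in for the weekly optimization, so this shows only how the phases might fit together, not the authors' algorithm.

```python
import itertools
import random

SHIFTS = ("-", "D", "N")  # off, day, night: a toy shift set (assumption)
DAYS = 7

def valid_patterns(max_work=5, max_run=4):
    """Pre-process phase: enumerate weekly patterns that satisfy toy hard rules."""
    patterns = []
    for p in itertools.product(SHIFTS, repeat=DAYS):
        if sum(s != "-" for s in p) > max_work:
            continue  # too many working days in the week
        if any(a == "N" and b == "D" for a, b in zip(p, p[1:])):
            continue  # no day shift directly after a night shift
        run = longest = 0
        for s in p:
            run = run + 1 if s != "-" else 0
            longest = max(longest, run)
        if longest <= max_run:
            patterns.append(p)
    return patterns

def understaffing(roster, demand):
    """Immediate cost: missing nurses summed over days and shift types."""
    cost = 0
    for day in range(DAYS):
        for shift in ("D", "N"):
            assigned = sum(p[day] == shift for p in roster)
            cost += max(0, demand[day][shift] - assigned)
    return cost

def value_estimate(worked, weights):
    """Linear value function approximation on one feature: workload imbalance."""
    return weights[0] + weights[1] * (max(worked) - min(worked))

def local_phase(patterns, worked, demand, weights, rng, samples=200, keep=5):
    """Sample candidate rosters; rank by immediate cost plus approximate value."""
    scored = []
    for _ in range(samples):
        roster = [rng.choice(patterns) for _ in worked]
        # History carried between weeks: cumulative days worked per nurse.
        after = [w + sum(s != "-" for s in p) for w, p in zip(worked, roster)]
        score = understaffing(roster, demand) + value_estimate(after, weights)
        scored.append((score, roster, after))
    scored.sort(key=lambda t: t[0])
    return scored[:keep]

def random_demand(rng):
    """Hypothetical demand model: future weekly demand is sampled at random."""
    return [{"D": rng.randint(1, 2), "N": rng.randint(0, 1)} for _ in range(DAYS)]

def global_phase(candidates, patterns, demand, weights, rng, horizon=2):
    """Lookahead: roll each candidate forward over sampled future weeks."""
    best = None
    for _, roster, after in candidates:
        total = understaffing(roster, demand)
        state = after
        for _ in range(horizon):
            future = random_demand(rng)
            _, nxt, state = local_phase(patterns, state, future, weights, rng, 50, 1)[0]
            total += understaffing(nxt, future)
        if best is None or total < best[0]:
            best = (total, roster)
    return best

rng = random.Random(0)
patterns = valid_patterns()
demand = random_demand(rng)
weights = (0.0, 0.5)  # arbitrary illustrative weights, not learned
candidates = local_phase(patterns, [0, 0, 0], demand, weights, rng)
cost, roster = global_phase(candidates, patterns, demand, weights, rng)
```

In this sketch the local phase keeps a handful of low-scoring weekly rosters for three nurses, and the global phase chooses among them by simulated cost over the lookahead horizon. In an actual approximate dynamic programming scheme the weights would be learned from experience rather than fixed by hand.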
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Shi, P., Landa-Silva, D. (2018). Approximate Dynamic Programming with Combined Policy Functions for Solving Multi-stage Nurse Rostering Problem. In: Nicosia, G., Pardalos, P., Giuffrida, G., Umeton, R. (eds) Machine Learning, Optimization, and Big Data. MOD 2017. Lecture Notes in Computer Science, vol 10710. Springer, Cham. https://doi.org/10.1007/978-3-319-72926-8_29
DOI: https://doi.org/10.1007/978-3-319-72926-8_29
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-72925-1
Online ISBN: 978-3-319-72926-8
eBook Packages: Computer Science, Computer Science (R0)