Bounds for Multistage Stochastic Programs Using Supervised Learning Strategies

  • Boris Defourny
  • Damien Ernst
  • Louis Wehenkel
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5792)


We propose a generic method for quickly obtaining good upper bounds on the minimal value of a multistage stochastic program. The method simulates a feasible decision policy, which is synthesized by a strategy that combines any scenario tree approximation from stochastic programming with supervised learning techniques from machine learning.
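As a rough illustration of the idea in the abstract, the sketch below learns a policy from (state, decision) pairs, repairs its output to keep it feasible, and averages its cost over fresh scenarios. Because any feasible policy costs at least as much as the optimal one, the simulated average estimates an upper bound on the minimal value. Everything in the sketch is an assumption made for illustration and is not taken from the paper: the toy inventory problem, the constants `U_MAX` and `T`, the placeholder training labels (which stand in for decisions that would be read off a solved scenario tree), and the choice of a 1-nearest-neighbour learner.

```python
import math
import random
import statistics

U_MAX = 10.0   # hypothetical capacity bound on each decision
T = 3          # hypothetical number of decision stages


def stage_cost(stock, order, demand):
    """Toy stage cost: ordering cost plus a holding or shortage penalty."""
    left = stock + order - demand
    return 1.0 * order + (0.5 * left if left >= 0 else -4.0 * left)


def make_training_set(rng, n=200):
    """Placeholder for (state, decision) pairs taken from a solved scenario tree.

    Assumption: the labels follow a noisy order-up-to rule instead of coming
    from an actual scenario tree optimization.
    """
    data = []
    for _ in range(n):
        stock = rng.uniform(-5.0, 10.0)
        target = 6.0 + rng.gauss(0.0, 0.3)
        data.append((stock, target - stock))
    return data


def learn_policy(data):
    """1-nearest-neighbour regression followed by projection onto [0, U_MAX]."""
    def policy(stock):
        _, order = min(data, key=lambda pair: abs(pair[0] - stock))
        return min(max(order, 0.0), U_MAX)  # feasibility repair
    return policy


def simulate(policy, rng, n_scenarios=2000):
    """Average the policy's total cost over independently sampled scenarios."""
    costs = []
    for _ in range(n_scenarios):
        stock, total = 0.0, 0.0
        for _ in range(T):
            order = policy(stock)           # uses only the current state
            demand = rng.uniform(2.0, 8.0)  # revealed after the decision
            total += stage_cost(stock, order, demand)
            stock = stock + order - demand
        costs.append(total)
    mean = statistics.fmean(costs)
    # One-sided 95% limit under a normal approximation of the sample mean.
    margin = 1.645 * statistics.stdev(costs) / math.sqrt(len(costs))
    return mean, mean + margin


rng = random.Random(0)
policy = learn_policy(make_training_set(rng))
mean_cost, upper_limit = simulate(policy, rng)
print(f"estimated policy cost: {mean_cost:.3f}")
print(f"one-sided 95% limit:   {upper_limit:.3f}")
```

The bound is statistical: the sample mean estimates the policy's expected cost, and the one-sided limit accounts for sampling error only. How tight the bound is depends on how well the learned policy imitates good decisions, which this toy example does not try to assess.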


Keywords: Markov Decision Process; Scenario Tree; Multivariate Adaptive Regression Spline; Supervised Learning Algorithm; Stochastic Programming Model





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Boris Defourny (1)
  • Damien Ernst (1)
  • Louis Wehenkel (1)
  1. University of Liège, Systems and Modeling, B28, Liège, Belgium
