Scenario reduction revisited: fundamental limits and guarantees

  • Napat Rujeerapaiboon
  • Kilian Schindler
  • Daniel Kuhn
  • Wolfram Wiesemann
Full Length Paper Series B


The goal of scenario reduction is to approximate a given discrete distribution with another discrete distribution that has fewer atoms. We distinguish continuous scenario reduction, where the new atoms may be chosen freely, from discrete scenario reduction, where the new atoms must be chosen from among the existing ones. Using the Wasserstein distance as a measure of proximity between distributions, we identify those n-point distributions on the unit ball that are least susceptible to scenario reduction, i.e., that have maximum Wasserstein distance to their closest m-point distributions for some prescribed \(m<n\). We also provide sharp bounds on the added benefit of continuous over discrete scenario reduction. Finally, we propose what are, to the best of our knowledge, the first polynomial-time constant-factor approximations for both discrete and continuous scenario reduction as well as the first exact exponential-time algorithms for continuous scenario reduction.
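To make the discrete variant concrete, the following is a minimal sketch and not one of the algorithms proposed in the paper. It assumes the type-1 Wasserstein distance with the Euclidean norm and enumerates every m-element subset of the existing atoms, so its running time grows exponentially and it is only suitable for very small instances. For a fixed set of retained atoms, an optimal reduced distribution moves the mass of each original atom to its nearest retained atom, and the resulting distance is the probability-weighted sum of those nearest-atom distances. The function name `reduce_scenarios` and its arguments are hypothetical.

```python
from itertools import combinations
import math


def reduce_scenarios(atoms, probs, m):
    """Brute-force discrete scenario reduction (illustrative sketch only).

    atoms: list of points (tuples of floats), the support of the original distribution
    probs: list of probabilities, one per atom
    m:     number of atoms to retain, with m <= len(atoms)

    Returns (distance, retained_indices, new_probs), where distance is the
    type-1 Wasserstein distance between the original distribution and the
    best reduced distribution supported on m of the existing atoms.
    """
    n = len(atoms)
    best = None
    for subset in combinations(range(n), m):
        # Assign every original atom to its nearest retained atom.
        assign = [
            min(subset, key=lambda j: math.dist(atoms[i], atoms[j]))
            for i in range(n)
        ]
        # Transport cost of moving each atom's mass to its assigned atom.
        cost = sum(
            probs[i] * math.dist(atoms[i], atoms[assign[i]]) for i in range(n)
        )
        if best is None or cost < best[0]:
            # Each retained atom collects the mass of the atoms assigned to it.
            new_probs = {j: 0.0 for j in subset}
            for i in range(n):
                new_probs[assign[i]] += probs[i]
            best = (cost, subset, new_probs)
    return best


# Example: four equally likely scenarios forming two tight pairs, reduced to two.
atoms = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (1.0, 0.1)]
probs = [0.25, 0.25, 0.25, 0.25]
distance, retained, new_probs = reduce_scenarios(atoms, probs, m=2)
print(distance, retained, new_probs)
```

In this example any optimal choice keeps one atom from each pair and gives each retained atom half of the total mass. Continuous scenario reduction would additionally allow the retained atoms to lie anywhere, which this sketch does not attempt.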


Scenario reduction, Wasserstein distance, Constant-factor approximation algorithm, k-median clustering, k-means clustering



The authors are indebted to the referees and the guest editors for their comments that considerably improved the manuscript. This research was funded by the SNSF Grant BSCGI0_157733 and the EPSRC Grants EP/M028240/1 and EP/M027856/1.



Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature and Mathematical Optimization Society 2018

Authors and Affiliations

  1. Risk Analytics and Optimization Chair, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
  2. Imperial College Business School, Imperial College London, London, United Kingdom
