Approximations of semicontinuous functions with applications to stochastic optimization and statistical estimation

  • Johannes O. Royset
Full Length Paper Series A


Upper semicontinuous (usc) functions arise in the analysis of maximization problems, distributionally robust optimization, and function identification, which includes many problems of nonparametric statistics. We establish that every usc function is the limit of a hypo-converging sequence of piecewise affine functions of the difference-of-max type and illustrate resulting algorithmic possibilities in the context of approximate solution of infinite-dimensional optimization problems. In an effort to quantify the ease with which classes of usc functions can be approximated by finite collections, we provide upper and lower bounds on covering numbers for bounded sets of usc functions under the Attouch-Wets distance. The result is applied in the context of stochastic optimization problems defined over spaces of usc functions. We establish confidence regions for optimal solutions based on sample average approximations and examine the accompanying rates of convergence. Examples from nonparametric statistics illustrate the results.
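As a small illustration of what a piecewise affine function of the difference-of-max type looks like, the sketch below evaluates f(x) = max_i(⟨a_i, x⟩ + α_i) − max_j(⟨b_j, x⟩ + β_j) in Python. The function names and the tent-function example are assumptions made for illustration only; they are not taken from the paper and do not reproduce its approximation construction.

```python
from typing import Sequence

Vector = Sequence[float]


def affine_max(slopes: Sequence[Vector], offsets: Sequence[float], x: Vector) -> float:
    """Return max_i (<slopes[i], x> + offsets[i]), a convex piecewise affine function."""
    return max(
        sum(s * xi for s, xi in zip(slope, x)) + offset
        for slope, offset in zip(slopes, offsets)
    )


def difference_of_max(
    a: Sequence[Vector],
    alpha: Sequence[float],
    b: Sequence[Vector],
    beta: Sequence[float],
    x: Vector,
) -> float:
    """Evaluate max_i(<a_i, x> + alpha_i) - max_j(<b_j, x> + beta_j)."""
    return affine_max(a, alpha, x) - affine_max(b, beta, x)


# Hypothetical example: the tent function max(0, 1 - |x|) on the real line,
# written as max(x, -x, 1) - max(x, -x).
A = [[1.0], [-1.0], [0.0]]
ALPHA = [0.0, 0.0, 1.0]
B = [[1.0], [-1.0]]
BETA = [0.0, 0.0]


def tent(x: float) -> float:
    return difference_of_max(A, ALPHA, B, BETA, [x])
```

The tent function is nonconvex and nonconcave, yet it is represented exactly by two max-of-affine pieces, which suggests why this class is flexible enough to serve as a family of approximating functions.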


Keywords

Hypo-convergence, Attouch-Wets distance, Approximation theory, Solution stability, Stochastic optimization, Epi-splines, Rate of convergence

Mathematics Subject Classification

90C15 Stochastic programming; 62G07 Density estimation; 62G08 Nonparametric regression; 62G15 Tolerance and confidence regions



This work is supported in part by DARPA under Grants HR0011-14-1-0060 and HR0011-8-34187, and by the Office of Naval Research (Science of Autonomy Program) under Grant N00014-17-1-2372.



Copyright information

© This is a U.S. Government work and not under copyright protection in the US; foreign copyright protection may apply 2019

Authors and Affiliations

  1. Operations Research Department, Naval Postgraduate School, Monterey, USA
