Statistical Papers, Volume 60, Issue 2, pp 565–584

An unexpected connection between Bayes A-optimal designs and the group lasso

  • Guillaume Sagnol
  • Edouard Pauwels
Regular Article


Abstract

We show that the A-optimal design optimization problem over m design points in \({\mathbb {R}}^n\) is equivalent to minimizing a quadratic function plus a group-lasso sparsity-inducing term over \(n\times m\) real matrices. This observation allows us to describe several new algorithms for A-optimal design based on splitting and block coordinate decomposition. These techniques are well known and have proved powerful for treating large-scale problems in the machine learning and signal processing communities. The proposed algorithms come with rigorous convergence guarantees and convergence rate estimates stemming from the optimization literature. Their performance is illustrated on synthetic benchmarks and compared to existing methods for solving the optimal design problem.
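To make the connection concrete, the reformulation described in the abstract has the generic shape "quadratic loss plus group-lasso penalty", which proximal splitting methods handle directly. The sketch below is not the authors' algorithm; it is a minimal ISTA-style proximal gradient loop for a problem of that shape, \(\min_X \tfrac12\Vert AX - B\Vert_F^2 + \lambda \sum_j \Vert X_{:,j}\Vert_2\), where each column of the \(n \times m\) matrix \(X\) is one group (here corresponding to one candidate design point). The matrices `A`, `B` and the function names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def block_soft_threshold(X, tau):
    """Proximal operator of tau * sum_j ||X[:, j]||_2.

    Shrinks each column of X toward zero by tau in Euclidean norm,
    setting columns with norm <= tau exactly to zero (group sparsity).
    """
    norms = np.linalg.norm(X, axis=0, keepdims=True)
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return X * scale

def prox_gradient_group_lasso(A, B, lam, n_iter=500):
    """ISTA for min_X 0.5 * ||A X - B||_F^2 + lam * sum_j ||X[:, j]||_2."""
    X = np.zeros((A.shape[1], B.shape[1]))
    # Lipschitz constant of the gradient of the smooth part.
    L = np.linalg.norm(A, 2) ** 2
    for _ in range(n_iter):
        grad = A.T @ (A @ X - B)            # gradient step on the quadratic
        X = block_soft_threshold(X - grad / L, lam / L)  # prox step on the penalty
    return X
```

The zero pattern of the columns of the returned matrix identifies the design points excluded from the support, which is how the group-lasso term induces sparse designs; accelerated (FISTA-style) or block coordinate variants follow the same template.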


Keywords

A-optimal design · Group lasso · Optimization · First-order methods

Mathematics Subject Classification

62K05 · 90C25



Acknowledgements
Both authors would like to thank two anonymous reviewers for their careful reading and detailed comments which helped improve the quality of this manuscript.



Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. Institut für Mathematik, Technische Universität Berlin, Berlin, Germany
  2. Université Toulouse 3 Paul Sabatier, Toulouse, France
