Advanced Optimization Topics

Numerical Analysis for Statisticians

Part of the book series: Statistics and Computing (SCO)

Abstract

Our final chapter on optimization provides a concrete introduction to several advanced topics. The first vignette describes classical penalty and barrier methods for constrained optimization [22, 37, 45]. Penalty methods operate on the exterior and barrier methods on the interior of the feasible region. Fortunately, it is fairly easy to prove global convergence for both methods under reasonable hypotheses.
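The exterior versus interior distinction can be seen on a toy problem. The sketch below is an illustration only, not the chapter's own algorithm: it minimizes the assumed objective f(x) = (x - 2)^2 subject to x <= 1, whose constrained minimum is x = 1. The function names (`golden_section_min`, `penalty_path`, `barrier_path`), the quadratic penalty, the logarithmic barrier, and the parameter schedules are all choices made for this example.

```python
import math


def golden_section_min(g, lo, hi, tol=1e-10):
    """Minimize a unimodal function g on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)  # left interior point
        d = a + inv_phi * (b - a)  # right interior point
        if g(c) < g(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return 0.5 * (a + b)


def f(x):
    """Hypothetical objective; the feasible region is x <= 1."""
    return (x - 2.0) ** 2


def penalty_path(rhos):
    """Minimize f plus a quadratic penalty for each penalty constant rho.

    The penalty vanishes on the feasible region and grows outside it,
    so the minimizers sit in the exterior (x > 1).
    """
    return [
        golden_section_min(
            lambda x, r=r: f(x) + r * max(0.0, x - 1.0) ** 2, -5.0, 5.0
        )
        for r in rhos
    ]


def barrier_path(mus):
    """Minimize f plus a logarithmic barrier for each barrier constant mu.

    The barrier is finite only for x < 1, so the search bracket stops
    just short of the boundary and the minimizers sit in the interior.
    """
    return [
        golden_section_min(
            lambda x, m=m: f(x) - m * math.log(1.0 - x), -5.0, 1.0 - 1e-12
        )
        for m in mus
    ]


if __name__ == "__main__":
    print("penalty:", penalty_path([1.0, 10.0, 100.0, 1000.0]))
    print("barrier:", barrier_path([1.0, 0.1, 0.01, 0.001]))
```

For this particular problem the minimizers have closed forms, (2 + rho)/(1 + rho) for the penalty and (3 - sqrt(1 + 2 mu))/2 for the barrier, so the penalty iterates approach 1 from above as rho grows while the barrier iterates approach 1 from below as mu shrinks.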

References

  1. Armstrong RD, Kung MT (1978) Algorithm AS 132: least absolute value estimates for a simple linear regression problem. Appl Stat 27:363-366

  2. Boyle JP, Dykstra RL (1985) A method for finding projections onto the intersection of convex sets in Hilbert space. In Advances in Order Restricted Statistical Inference, Lecture Notes in Statistics, Springer, New York, 28-47

  3. Bregman LM (1965) The method of successive projection for finding a common point of convex sets. Soviet Math Doklady 6:688-692

  4. Candes EJ, Tao T (2007) The Dantzig selector: statistical estimation when p is much larger than n. Annals Stat 35:2313-2351

  5. Candes EJ, Wakin M, Boyd S (2007) Enhancing sparsity by reweighted ℓ1 minimization. J Fourier Anal Appl 14:877-905

  6. Censor Y, Reich S (1996) Iterations of paracontractions and firmly nonexpansive operators with applications to feasibility and optimization. Optimization 37:323-339

  7. Censor Y, Zenios SA (1992) Proximal minimization with D-functions. J Optimization Theory Appl 73:451-464

  8. Chen SS, Donoho DL, Saunders MA (1998) Atomic decomposition by basis pursuit. SIAM J Sci Comput 20:33-61

  9. Claerbout J, Muir F (1973) Robust modeling with erratic data. Geophysics 38:826-844

  10. Daubechies I, Defrise M, De Mol C (2004) An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Comm Pure Appl Math 57:1413-1457

  11. de Leeuw J, Lange K (2007) Sharp quadratic majorization in one dimension.

  12. Deutsch F (2001) Best Approximation in Inner Product Spaces. Springer, New York

  13. Donoho D, Johnstone I (1994) Ideal spatial adaptation by wavelet shrinkage. Biometrika 81:425-455

  14. Dykstra RL (1983) An algorithm for restricted least squares estimation. J Amer Stat Assoc 78:837-842

  15. Edgeworth FY (1887) On observations relating to several quantities. Hermathena 6:279-285

  16. Edgeworth FY (1888) On a new method of reducing observations relating to several quantities. Philosophical Magazine 25:184-191

  17. Efron B, Hastie T, Johnstone I, Tibshirani R (2004) Least angle regression. Annals Stat 32:407-499

  18. Elsner L, Koltracht L, Neumann M (1992) Convergence of sequential and asynchronous nonlinear paracontractions. Numerische Mathematik 62:305-319

  19. Fang S-C, Puthenpura S (1993) Linear Optimization and Extensions: Theory and Algorithms. Prentice-Hall, Englewood Cliffs, NJ

  20. Fazel M, Hindi M, Boyd S (2003) Log-det heuristic for matrix rank minimization with applications to Hankel and Euclidean distance matrices. Proceedings American Control Conference 3:2156-2162

  21. Ferguson TS (1996) A Course in Large Sample Theory. Chapman & Hall, London

  22. Forsgren A, Gill PE, Wright MH (2002) Interior point methods for nonlinear optimization. SIAM Review 44:523-597

  23. Friedman J, Hastie T, Tibshirani R (2007) Pathwise coordinate optimization. Ann Appl Stat 1:302-332

  24. Friedman J, Hastie T, Tibshirani R (2009) Regularization paths for generalized linear models via coordinate descent. Technical Report, Stanford University Department of Statistics

  25. Fu WJ (1998) Penalized regressions: the bridge versus the lasso. J Comp Graph Stat 7:397-416

  26. Groenen PJF, Nalbantov G, Bioch JC (2007) Nonlinear support vector machines through iterative majorization and I-splines. Studies in Classification, Data Analysis, and Knowledge Organization, Lenz HJ, Decker R, Springer, Heidelberg-Berlin, pp 149-161

  27. Hestenes MR (1981) Optimization Theory: The Finite Dimensional Case. Robert E Krieger Publishing, Huntington, NY

  28. Hunter DR, Lange K (2004) A tutorial on MM algorithms. Amer Statistician 58:30-37

  29. Hunter DR, Li R (2005) Variable selection using MM algorithms. Annals Stat 33:1617-1642

  30. Lange K (1994) An adaptive barrier method for convex programming. Methods Applications Analysis 1:392-402

  31. Lange K (2004) Optimization. Springer, New York

  32. Lange K, Wu T (2007) An MM algorithm for multicategory vertex discriminant analysis. J Computational Graphical Stat 17:527-544

  33. Lee DD, Seung HS (1999) Learning the parts of objects by non-negative matrix factorization. Nature 401:788-791

  34. Lee DD, Seung HS (2001) Algorithms for non-negative matrix factorization. Advances in Neural Information Processing Systems 13:556-562

  35. Levina E, Rothman A, Zhu J (2008) Sparse estimation of large covariance matrices via a nested lasso penalty. Ann Appl Stat 2:245-263

  36. Li Y, Arce GR (2004) A maximum likelihood approach to least absolute deviation regression. EURASIP J Applied Signal Proc 2004:1762-1769

  37. Luenberger DG (1984) Linear and Nonlinear Programming, 2nd ed. Addison-Wesley, Reading, MA

  38. Meng X-L, Rubin DB (1991) Using EM to obtain asymptotic variance-covariance matrices: the SEM algorithm. J Amer Stat Assoc 86:899-909

  39. Michelot C (1986) A finite algorithm for finding the projection of a point onto the canonical simplex in R^n. J Optimization Theory Applications 50:195-200

  40. Park MY, Hastie T (2008) Penalized logistic regression for detecting gene interactions. Biostatistics 9:30-50

  41. Pauca VP, Piper J, Plemmons RJ (2006) Nonnegative matrix factorization for spectral data analysis. Linear Algebra Applications 416:29-47

  42. Portnoy S, Koenker R (1997) The Gaussian hare and the Laplacian tortoise: computability of squared-error versus absolute-error estimators. Stat Sci 12:279-300

  43. Santosa F, Symes WW (1986) Linear inversion of band-limited reflection seismograms. SIAM J Sci Stat Comput 7:1307-1330

  44. Silvey SD (1975) Statistical Inference. Chapman & Hall, London

  45. Ruszczynski A (2006) Nonlinear Optimization. Princeton University Press, Princeton, NJ

  46. Schölkopf B, Smola AJ (2002) Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press, Cambridge, MA

  47. Taylor H, Banks SC, McCoy JF (1979) Deconvolution with the ℓ1 norm. Geophysics 44:39-52

  48. Teboulle M (1992) Entropic proximal mappings with applications to nonlinear programming. Math Operations Research 17:670-690

  49. Tibshirani R (1996) Regression shrinkage and selection via the lasso. J Roy Stat Soc, Series B 58:267-288

  50. Vapnik V (1995) The Nature of Statistical Learning Theory. Springer, New York

  51. Wang L, Gordon MD, Zhu J (2006) Regularized least absolute deviations regression and an efficient algorithm for parameter tuning. Proceedings of the Sixth International Conference on Data Mining (ICDM’06). IEEE Computer Society, pp 690-700

  52. Wang S, Yehya N, Schadt EE, Wang H, Drake TA, Lusis AJ (2006) Genetic and genomic analysis of a fat mass trait with complex inheritance reveals marked sex specificity. PLoS Genet 2:148-159

  53. Weston J, Elisseeff A, Schölkopf B, Tipping M (2003) Use of the zero-norm with linear models and kernel methods. J Machine Learning Research 3:1439-1461

  54. Wu TT, Lange K (2008) Coordinate descent algorithms for lasso penalized regression. Ann Appl Stat 2:224-244

Author information

Correspondence to Kenneth Lange.

Copyright information

© 2010 Springer New York

About this chapter

Cite this chapter

Lange, K. (2010). Advanced Optimization Topics. In: Numerical Analysis for Statisticians. Statistics and Computing. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-5945-4_16
