Abstract
Our final chapter on optimization provides a concrete introduction to several advanced topics. The first vignette describes classical penalty and barrier methods for constrained optimization [22, 37, 45]. Penalty methods operate on the exterior and barrier methods on the interior of the feasible region. Fortunately, it is fairly easy to prove global convergence for both methods under reasonable hypotheses.
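The exterior penalty idea described above can be illustrated with a minimal sketch (not taken from the chapter): to minimize f(x) = x² subject to x ≥ 1, one minimizes f(x) + μ·max(0, 1−x)² for an increasing sequence of penalty constants μ, so the unconstrained iterates approach the constrained optimum from outside the feasible region. The objective, constraint, and 1-D search routine here are illustrative choices, not the book's.

```python
# Exterior (quadratic) penalty method sketch: minimize f(x) = x^2
# subject to x >= 1. Hypothetical example for illustration only.

def penalized(x, mu):
    # objective plus quadratic penalty on the constraint violation
    viol = max(0.0, 1.0 - x)      # amount by which x >= 1 is violated
    return x**2 + mu * viol**2

def minimize_1d(f, lo=-5.0, hi=5.0, iters=200):
    # crude ternary search; adequate for this convex 1-D function
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

x = 0.0
for mu in [1.0, 10.0, 100.0, 1000.0]:
    # each unconstrained minimizer is x = mu / (1 + mu), which lies
    # just outside the feasible region and tends to x* = 1 as mu grows
    x = minimize_1d(lambda t: penalized(t, mu))

print(round(x, 3))  # → 0.999, approaching the constrained minimum x* = 1
```

Barrier methods would instead add a term such as −μ·log(x − 1) that blows up at the boundary, keeping iterates strictly interior while μ is driven to zero.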
References
Armstrong RD, Kung MT (1978) Algorithm AS 132: least absolute value estimates for a simple linear regression problem. Appl Stat 27:363-366
Boyle JP, Dykstra RL (1985) A method for finding projections onto the intersection of convex sets in Hilbert space. In Advances in Order Restricted Statistical Inference, Lecture Notes in Statistics, Springer, New York, 28-47
Bregman LM (1965) The method of successive projection for finding a common point of convex sets. Soviet Math Doklady 6:688-692
Candes EJ, Tao T (2007) The Dantzig selector: statistical estimation when p is much larger than n. Annals Stat 35:2313-2351
Candes EJ, Wakin M, Boyd S (2007) Enhancing sparsity by reweighted ℓ1 minimization. J Fourier Anal Appl 14:877-905
Censor Y, Reich S (1996) Iterations of paracontractions and firmly nonexpansive operators with applications to feasibility and optimization. Optimization 37:323-339
Censor Y, Zenios SA (1992) Proximal minimization with D-functions. J Optimization Theory Appl 73:451-464
Chen SS, Donoho DL, Saunders MA (1998) Atomic decomposition by basis pursuit. SIAM J Sci Comput 20:33-61
Claerbout J, Muir F (1973) Robust modeling with erratic data. Geophysics 38:826-844
Daubechies I, Defrise M, De Mol C (2004) An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Comm Pure Appl Math 57:1413-1457
de Leeuw J, Lange K (2007) Sharp quadratic majorization in one dimension.
Deutsch F (2001) Best Approximation in Inner Product Spaces. Springer, New York
Donoho D, Johnstone I (1994) Ideal spatial adaptation by wavelet shrinkage. Biometrika 81:425-455
Dykstra RL (1983) An algorithm for restricted least squares estimation. J Amer Stat Assoc 78:837-842
Edgeworth FY (1887) On observations relating to several quantities. Hermathena 6:279-285
Edgeworth FY (1888) On a new method of reducing observations relating to several quantities. Philosophical Magazine 25:184-191
Efron B, Hastie T, Johnstone I, Tibshirani R (2004) Least angle regression. Annals Stat 32:407-499
Elsner L, Koltracht L, Neumann M (1992) Convergence of sequential and asynchronous nonlinear paracontractions. Numerische Mathematik 62:305-319
Fang S-C, Puthenpura S (1993) Linear Optimization and Extensions: Theory and Algorithms. Prentice-Hall, Englewood Cliffs, NJ
Fazel M, Hindi M, Boyd S (2003) Log-det heuristic for matrix rank minimization with applications to Hankel and Euclidean distance matrices. Proceedings American Control Conference 3:2156-2162
Ferguson TS (1996) A Course in Large Sample Theory. Chapman & Hall, London
Forsgren A, Gill PE, Wright MH (2002) Interior point methods for nonlinear optimization. SIAM Review 44:523-597
Friedman J, Hastie T, Tibshirani R (2007) Pathwise coordinate optimization. Ann Appl Stat 1:302-332
Friedman J, Hastie T, Tibshirani R (2009) Regularized paths for generalized linear models via coordinate descent. Technical Report, Stanford University Department of Statistics
Fu WJ (1998) Penalized regressions: the bridge versus the lasso. J Comp Graph Stat 7:397-416
Groenen PJF, Nalbantov G, Bioch JC (2007) Nonlinear support vector machines through iterative majorization and I-splines. Studies in Classification, Data Analysis, and Knowledge Organization, Lenz HJ, Decker R, Springer, Heidelberg-Berlin, pp 149-161
Hestenes MR (1981) Optimization Theory: The Finite Dimensional Case. Robert E Krieger Publishing, Huntington, NY
Hunter DR, Lange K (2004) A tutorial on MM algorithms. Amer Statistician 58:30-37
Hunter DR, Li R (2005) Variable selection using MM algorithms. Annals Stat 33:1617-1642
Lange K (1994) An adaptive barrier method for convex programming. Methods Applications Analysis 1:392-402
Lange, K (2004) Optimization. Springer, New York
Lange K, Wu T (2007) An MM algorithm for multicategory vertex discriminant analysis. J Computational Graphical Stat 17:527-544
Lee DD, Seung HS (1999) Learning the parts of objects by non-negative matrix factorization. Nature 401:788-791
Lee DD, Seung HS (2001) Algorithms for non-negative matrix factorization. Advances in Neural Information Processing Systems 13:556-562
Levina E, Rothman A, Zhu J (2008) Sparse estimation of large covariance matrices via a nested lasso penalty. Ann Appl Stat 2:245-263
Li Y, Arce GR (2004) A maximum likelihood approach to least absolute deviation regression. EURASIP J Applied Signal Proc 2004:1762-1769
Luenberger DG (1984) Linear and Nonlinear Programming, 2nd ed. Addison-Wesley, Reading, MA
Meng X-L, Rubin DB (1991) Using EM to obtain asymptotic variance-covariance matrices: the SEM algorithm. J Amer Stat Assoc 86:899-909
Michelot C (1986) A finite algorithm for finding the projection of a point onto the canonical simplex in R^n. J Optimization Theory Applications 50:195-200
Park MY, Hastie T (2008) Penalized logistic regression for detecting gene interactions. Biostatistics 9:30-50
Pauca VP, Piper J, Plemmons RJ (2006) Nonnegative matrix factorization for spectral data analysis. Linear Algebra Applications 416:29-47
Portnoy S, Koenker R (1997) The Gaussian hare and the Laplacian tortoise: computability of squared-error versus absolute-error estimators. Stat Sci 12:279-300
Santosa F, Symes WW (1986) Linear inversion of band-limited reflection seismograms. SIAM J Sci Stat Comput 7:1307-1330
Ruszczynski A (2006) Nonlinear Optimization. Princeton University Press, Princeton, NJ
Silvey SD (1975) Statistical Inference. Chapman & Hall, London
Schölkopf B, Smola AJ (2002) Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press, Cambridge, MA
Taylor H, Banks SC, McCoy JF (1979) Deconvolution with the ℓ1 norm. Geophysics 44:39-52
Teboulle M (1992) Entropic proximal mappings with applications to nonlinear programming. Math Operations Research 17:670-690
Tibshirani R (1996) Regression shrinkage and selection via the lasso. J Roy Stat Soc, Series B 58:267-288
Vapnik V (1995) The Nature of Statistical Learning Theory. Springer, New York
Wang L, Gordon MD, Zhu J (2006) Regularized least absolute deviations regression and an efficient algorithm for parameter tuning. Proceedings of the Sixth International Conference on Data Mining (ICDM’06). IEEE Computer Society, pp 690-700
Wang S, Yehya N, Schadt EE, Wang H, Drake TA, Lusis AJ (2006) Genetic and genomic analysis of a fat mass trait with complex inheritance reveals marked sex specificity. PLoS Genet 2:148-159
Weston J, Elisseeff A, Schölkopf B, Tipping M (2003) Use of the zero norm with linear models and kernel methods. J Machine Learning Research 3:1439-1461
Wu TT, Lange K (2008) Coordinate descent algorithms for lasso penalized regression. Ann Appl Stat 2:224-244
Copyright information
© 2010 Springer New York
Cite this chapter
Lange, K. (2010). Advanced Optimization Topics. In: Numerical Analysis for Statisticians. Statistics and Computing. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-5945-4_16
DOI: https://doi.org/10.1007/978-1-4419-5945-4_16
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4419-5944-7
Online ISBN: 978-1-4419-5945-4
eBook Packages: Mathematics and Statistics (R0)