Newton’s Method and Scoring

Optimization

Part of the book series: Springer Texts in Statistics (STS, volume 95)

Abstract

Block relaxation and the MM algorithm are hardly the only methods of optimization. Newton’s method is better known and more widely applied. Despite its defects, Newton’s method is the gold standard for speed of convergence and forms the basis of most modern optimization algorithms in low dimensions. Its many variants seek to retain its fast convergence while taming its defects. The variants all revolve around the core idea of locally approximating the objective function by a strictly convex quadratic function. At each iteration the quadratic approximation is optimized. Safeguards are introduced to keep the iterates from veering toward irrelevant stationary points.
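The core idea described above, locally approximating the objective by a strictly convex quadratic, optimizing the approximation, and safeguarding against bad steps, can be sketched in a few lines. The following is a minimal one-dimensional illustration, not the chapter's own algorithm: the Newton step minimizes the local quadratic model, and step-halving is one common safeguard against iterates veering toward irrelevant stationary points. The function names and test problem are illustrative choices.

```python
import math

def newton_minimize(f, fprime, fsecond, x0, tol=1e-10, max_iter=50):
    """Safeguarded Newton's method for 1-D minimization (illustrative sketch).

    Each iterate approximates f near x by the quadratic
    f(x) + f'(x) d + (1/2) f''(x) d^2, whose minimizer yields the
    Newton step d = -f'(x) / f''(x).  Step-halving guards against
    steps that fail to decrease the objective.
    """
    x = x0
    for _ in range(max_iter):
        g, h = fprime(x), fsecond(x)
        if h <= 0:            # quadratic model not strictly convex;
            h = abs(h) + 1.0  # crude fallback to keep the step a descent step
        d = -g / h            # Newton step minimizes the local quadratic
        t = 1.0
        # safeguard: halve the step until the objective actually decreases
        while f(x + t * d) > f(x) and t > 1e-12:
            t /= 2.0
        x_new = x + t * d
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# Example: f(x) = exp(x) - 2x is strictly convex with minimum at x = ln 2.
xmin = newton_minimize(lambda x: math.exp(x) - 2 * x,
                       lambda x: math.exp(x) - 2,
                       lambda x: math.exp(x),
                       x0=2.0)
print(round(xmin, 6))  # → 0.693147
```

Near the minimum the safeguard becomes inactive (full steps are accepted) and the iteration exhibits the fast local convergence that makes Newton's method the gold standard for speed.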

Copyright information

© 2013 Springer Science+Business Media New York

Cite this chapter

Lange, K. (2013). Newton’s Method and Scoring. In: Optimization. Springer Texts in Statistics, vol 95. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-5838-8_10
