Optimization, pp 221–244

The EM Algorithm

  • Kenneth Lange
Part of the Springer Texts in Statistics book series (STS, volume 95)


Maximum likelihood is the dominant form of estimation in applied statistics. Because closed-form solutions to likelihood equations are the exception rather than the rule, numerical methods for finding maximum likelihood estimates are of paramount importance. In this chapter we study maximum likelihood estimation by the EM algorithm, a special case of the MM algorithm. At the heart of every EM algorithm is some notion of missing data. Data can be missing in the ordinary sense of a failure to record certain observations on certain cases. Data can also be missing in a theoretical sense. We can think of the E (expectation) step of the algorithm as filling in the missing data. This action replaces the loglikelihood of the observed data by a minorizing function. This surrogate function is then maximized in the M step. Because the surrogate function is usually much simpler than the likelihood, we can often solve the M step analytically. The price we pay for this simplification is that the EM algorithm is iterative. Reconstructing the missing data is bound to be slightly wrong if the parameters do not already equal their maximum likelihood estimates.
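The E/M interplay described above can be sketched concretely for a two-component univariate Gaussian mixture, a standard illustration of the algorithm. This is a minimal sketch under my own assumptions: the function name `em_gaussian_mixture`, the min/max initialization, and the stopping rule are illustrative choices, not taken from the chapter.

```python
import numpy as np

def em_gaussian_mixture(x, n_iter=200, tol=1e-8):
    """EM for a two-component univariate Gaussian mixture (illustrative sketch).

    The component labels play the role of the missing data. The E step
    computes their conditional expectations (responsibilities) given the
    observed data and current parameters; the M step maximizes the
    resulting surrogate, which here has a closed-form solution.
    """
    x = np.asarray(x, dtype=float)
    # Crude but deterministic initialization (an illustrative choice).
    pi = 0.5                                    # weight of component 1
    mu = np.array([np.min(x), np.max(x)])       # component means
    sigma2 = np.array([x.var(), x.var()])       # component variances
    loglik_old = -np.inf
    for _ in range(n_iter):
        # E step: normal densities under each component ...
        d0 = np.exp(-0.5 * (x - mu[0])**2 / sigma2[0]) / np.sqrt(2 * np.pi * sigma2[0])
        d1 = np.exp(-0.5 * (x - mu[1])**2 / sigma2[1]) / np.sqrt(2 * np.pi * sigma2[1])
        mix = (1.0 - pi) * d0 + pi * d1
        # ... give responsibilities r[i] = P(label = 1 | x[i], current params).
        r = pi * d1 / mix
        # Observed-data loglikelihood at the current parameters.
        loglik = np.sum(np.log(mix))
        # M step: weighted closed-form updates maximize the surrogate.
        pi = r.mean()
        mu = np.array([np.sum((1.0 - r) * x) / np.sum(1.0 - r),
                       np.sum(r * x) / np.sum(r)])
        sigma2 = np.array([np.sum((1.0 - r) * (x - mu[0])**2) / np.sum(1.0 - r),
                           np.sum(r * (x - mu[1])**2) / np.sum(r)])
        # The ascent property guarantees loglik never decreases across iterations.
        if loglik - loglik_old < tol:
            break
        loglik_old = loglik
    return pi, mu, sigma2, loglik
```

Note that each iteration is cheap and monotone (the loglikelihood never decreases), but convergence is only linear; as the abstract puts it, the reconstructed missing data are slightly wrong unless the parameters already equal their maximum likelihood estimates, so the algorithm must iterate.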





Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  • Kenneth Lange
    Biomathematics, Human Genetics, Statistics, University of California, Los Angeles, USA
