Monte Carlo Optimization

  • Christian P. Robert
  • George Casella
Part of the Springer Texts in Statistics book series (STS)


This chapter is the equivalent for optimization problems of what Chapter 3 is for integration problems. Here we distinguish between two separate uses of computer-generated random variables. The first use, as seen in Section 5.2, is to produce stochastic techniques that reach the maximum (or minimum) of a function, devising random exploration techniques on the surface of this function that avoid being trapped in a local maximum (or minimum) while remaining sufficiently attracted by the global maximum (or minimum). The second use, described in Section 5.3, is closer to Chapter 3 in that it approximates the function to be optimized. The most popular algorithm from this perspective is the EM (Expectation-Maximization) algorithm.
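As an illustration of the first use, the random-exploration idea can be sketched as a minimal simulated-annealing search (the function name, the logarithmic cooling schedule, and the bimodal test function below are illustrative choices, not the chapter's own code):

```python
import math
import random

def simulated_annealing(f, x0, n_iter=5000, scale=1.0, seed=0):
    """Maximize f by a random-walk simulated-annealing search.

    A Gaussian perturbation of the current point is always accepted
    when it increases f, and is accepted with probability
    exp((f(y) - f(x)) / T_t) otherwise, where the temperature T_t
    decreases to zero so that downhill moves become increasingly rare.
    """
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best, fbest = x, fx
    for t in range(1, n_iter + 1):
        temp = 1.0 / math.log(1.0 + t)   # logarithmic cooling schedule
        y = x + rng.gauss(0.0, scale)    # random exploration step
        fy = f(y)
        if fy >= fx or rng.random() < math.exp((fy - fx) / temp):
            x, fx = y, fy
        if fx > fbest:                   # keep the best point visited
            best, fbest = x, fx
    return best, fbest

# Bimodal test function: a local mode near x = -3 and the global mode
# near x = 3; even when started at the local mode, the occasional
# acceptance of downhill moves lets the chain escape toward x = 3.
f = lambda x: math.exp(-(x - 3) ** 2) + 0.5 * math.exp(-(x + 3) ** 2)
x_hat, f_hat = simulated_annealing(f, x0=-3.0)
```

The logarithmic cooling schedule is the one under which theoretical convergence to the global maximum can be established; faster (e.g., geometric) schedules are common in practice but lose that guarantee.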


Keywords: Simulated Annealing · Stochastic Approximation · Exponential Family · Monte Carlo Optimization





Copyright information

© Springer Science+Business Media New York 2004

Authors and Affiliations

  • Christian P. Robert, CEREMADE, Université Paris Dauphine, Paris Cedex 16, France
  • George Casella, Department of Statistics, University of Florida, Gainesville, USA
