
Bayesian Computational Methods

Handbook of Computational Statistics

Part of the book series: Springer Handbooks of Computational Statistics ((SHCS))

Abstract

If, in the mid-1980s, one had asked the average statistician about the difficulties of using Bayesian statistics, the most likely answer would have been “Well, there is this problem of selecting a prior distribution and then, even if one agrees on the prior, the whole Bayesian inference is simply impossible to implement in practice!” The same question asked in the twenty-first century does not produce the same reply, but rather a much less aggressive complaint about the lack of generic software (besides WinBUGS), along with the renewed worry of subjectively selecting a prior! The last twenty years have indeed witnessed a tremendous change in the way Bayesian statistics is perceived, both by mathematical statisticians and by applied statisticians, and the impetus behind this change has been a prodigious leap forward in computational abilities. The availability of very powerful approximation methods has correlatively freed Bayesian modelling, in terms of both model scope and prior modelling. This opening has induced many more scientists from outside the statistics community to adopt a Bayesian perspective, as they can now handle those tools on their own. As discussed below, a most successful illustration of this gained freedom can be seen in Bayesian model choice, which was only emerging at the beginning of the MCMC era, for lack of appropriate computational tools.


Notes

  1.

    In this chapter, the denomination universal is used in the sense of uniformly over all distributions.

  2.

    To impose the stationarity constraint when the order of the AR(p) model varies, it is necessary to reparameterise this model in terms of either the partial autocorrelations or of the roots of the associated lag polynomial. (See, e.g., Robert 2007, Sect. 4.5.)

  3.

    In this presentation of Bayes factors, we completely bypass the methodological difficulty of defining π(θ ∈ Θ 0) when Θ 0 is of measure 0 for the original prior π and refer the reader to Robert (2007, Sect. 5.2.3) and Marin and Robert (2007, Sect. 2.3.2) for proper coverage of this issue.

  4.

    The prior distribution can be used for importance sampling only if it is a proper prior and not a σ-finite measure.

  5.

    The constant order of the Monte Carlo error obviously does not imply that the computational effort remains the same as the dimension increases, but rather that the decrease (with m) in variation has the rate \(1/\sqrt{m}\).
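The \(1/\sqrt{m}\) rate is easy to check by simulation. The sketch below (plain Python, purely illustrative; the uniform integrand, the replication count and all names are our own choices, not the chapter's) measures the spread of a Monte Carlo estimator over independent replications and confirms that quadrupling m roughly halves it.

```python
import random
import statistics

def mc_mean(m, seed):
    # Plain Monte Carlo estimate of E[U] for U ~ Uniform(0, 1), true value 1/2.
    rng = random.Random(seed)
    return sum(rng.random() for _ in range(m)) / m

def estimator_sd(m, reps=200):
    # Spread of the Monte Carlo estimator across independent replications.
    return statistics.stdev(mc_mean(m, seed) for seed in range(reps))

# Quadrupling the simulation size should roughly halve the spread,
# i.e. the error decreases at rate 1/sqrt(m).
ratio = estimator_sd(1_000) / estimator_sd(4_000)
```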

  6.

    The empirical (Monte Carlo) confidence interval is not to be confused with the asymptotic confidence interval derived from the normal approximation. As discussed in Robert and Casella (2004, Chap. 4), these two intervals may differ considerably in width, with the interval derived from the CLT being much more optimistic!

  7.

    An alternative to the simulation from one \(\mathcal{T} (\nu,{x}_{i}, 1)\) distribution that does not require an extensive study on the most appropriate x_i is to use a mixture of the \(\mathcal{T} (\nu,{x}_{i}, 1)\) distributions. As seen in Sect. 26.5.2, the weights of this mixture can even be optimised automatically.
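For concreteness, here is a sketch of importance sampling with such a mixture of Student-t proposals (the standard normal target, the centres x_i, ν = 3, the equal weights and all names are assumptions made for this example, not prescriptions from the chapter):

```python
import math
import random

NU = 3.0
CENTRES = [-2.0, 0.0, 2.0]        # the x_i's, arbitrary for the example
WEIGHTS = [1 / 3, 1 / 3, 1 / 3]   # equal mixture weights

def t_pdf(x, nu, centre):
    # Density of a Student-t with nu degrees of freedom located at `centre`.
    z = x - centre
    c = math.gamma((nu + 1) / 2) / (math.gamma(nu / 2) * math.sqrt(nu * math.pi))
    return c * (1 + z * z / nu) ** (-(nu + 1) / 2)

def sample_mixture(rng):
    # (a) pick a component, (b) draw from the corresponding t distribution.
    centre = rng.choices(CENTRES, weights=WEIGHTS)[0]
    z = rng.gauss(0.0, 1.0)
    v = rng.gammavariate(NU / 2, 2.0)    # chi-square with NU degrees of freedom
    return centre + z / math.sqrt(v / NU)

def is_estimate(h, m, seed=0):
    # Importance sampling estimate of E[h(X)] under the N(0, 1) target.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(m):
        x = sample_mixture(rng)
        target = math.exp(-x * x / 2) / math.sqrt(2 * math.pi)
        proposal = sum(w * t_pdf(x, NU, c) for w, c in zip(WEIGHTS, CENTRES))
        total += h(x) * target / proposal
    return total / m

est = is_estimate(lambda x: x * x, 20_000)   # true value is E[X^2] = 1
```

Because the t tails dominate the normal tails, the importance ratio stays bounded and the estimator has finite variance.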

  8.

    We stress the point that this is mostly an academic exercise as, in regular settings, it is rarely the case that independent components are used for the importance function.

  9.

    Sect. 26.4.3 covers in greater detail the setting of varying-dimension problems, with the same theme that completion distributions and parameters are necessary but also influential for the performance of the approximation.

  10.

    Even in the simple case of the probit model, MCMC algorithms do not always converge very quickly, as shown in Robert and Casella (2004, Chap. 14).

  11.

    It is quite interesting to see that the mixture Gibbs sampler suffers from the same pathology as the EM algorithm, although this is not surprising given that it is based on the same completion scheme.

  12.

    This wealth of possible alternatives to the completion Gibbs sampler is a mixed blessing in that their tuning parameters, for instance the scale of the random-walk proposals, need to be calibrated properly to avoid inefficiencies.

  13.

    The difficulty with the infinite part of the problem is easily solved in that the setting is identical to simulation problems in (countable or uncountable) infinite spaces. When running simulations in those spaces, some values are never visited by the simulated Markov chain, and the chances that a value is visited are related to the probability of this value under the target distribution.

  14.

    Early proposals to solve the varying-dimension problem involved saturation schemes where all the parameters for all models were updated deterministically (Carlin and Chib 1995), but they do not apply to an infinite collection of models and need to be precisely calibrated to achieve a sufficient amount of moves between models.

  15.

    For a simple proof that the acceptance probability guarantees that the stationary distribution is π(k, θ(k)), see Robert and Casella (2004, Sect. 11.2.2).

  16.

    In the birth acceptance probability, the factorials k! and (k + 1)! appear as the numbers of ways of ordering the k and k + 1 components of the mixtures. The ratio cancels with \(1/(k + 1)\), which is the probability of selecting a particular component for the death step.
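Written out, the cancellation in the birth acceptance ratio is the elementary identity

```latex
\frac{(k+1)!}{k!}\times\frac{1}{k+1}
  = (k+1)\times\frac{1}{k+1} = 1,
```

so neither the factorials nor the component-selection probability need appear explicitly in an implementation.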

  17.

    The “sequential” denomination in the sequential Monte Carlo methods thus refers to the algorithmic part, not to the statistical part.

  18.

    The generic Rao–Blackwellised improvement was introduced in the original MCMC paper of Gelfand and Smith (1990) and studied by Liu et al. (1994) and Casella and Robert (1996). More recent developments are proposed in Cornuet et al. (2009), in connection with adaptive algorithms like PMC.

  19.

    Using a Gaussian non-parametric kernel estimator amounts to (a) sampling from the x_i^{(t)}'s with equal weights and (b) using a normal random-walk move from the selected x_i^{(t)}, with standard deviation equal to the bandwidth of the kernel.
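In code, steps (a) and (b) take two lines; the sketch below (our own illustration; the bandwidth value and all names are arbitrary) also shows the resulting variance inflation of roughly the squared bandwidth.

```python
import random
import statistics

def kernel_sample(points, bandwidth, n_out, seed=0):
    # Sampling from the Gaussian kernel density estimate of `points`:
    # (a) resample a support point with equal weights,
    # (b) add a normal random-walk move with sd equal to the bandwidth.
    rng = random.Random(seed)
    return [rng.choice(points) + rng.gauss(0.0, bandwidth) for _ in range(n_out)]

base_rng = random.Random(1)
cloud = [base_rng.gauss(0.0, 1.0) for _ in range(2000)]
moved = kernel_sample(cloud, bandwidth=0.5, n_out=5000, seed=2)
# `moved` keeps the location of `cloud` while its variance grows
# by roughly bandwidth ** 2 = 0.25.
```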

  20.

    When the survival rate of a proposal distribution is null, the corresponding number r_k of proposals with scale v_k is set to a positive value, such as 1% of the sample size, in order to avoid the complete removal of that scale.
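A minimal sketch of this floor rule (the function, its arguments and the integer survival counts are our own illustration; the chapter specifies only the 1% floor):

```python
def allocate_proposals(survival_counts, sample_size, floor_frac=0.01):
    # Distribute proposals over the scales v_k proportionally to their
    # survival counts, flooring each r_k at floor_frac of the sample
    # size so that no scale is ever removed entirely.
    floor = max(1, int(floor_frac * sample_size))
    total = sum(survival_counts)
    return [max(floor, round(sample_size * s / total))
            for s in survival_counts]

counts = allocate_proposals([0, 3, 7], 1000)
# the "dead" first scale still receives 10 proposals instead of 0
```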

  21.

    An R package called mcsm has been developed in association with Robert and Casella (2009) for training in Monte Carlo methods.

References

  • Abowd, J., Kramarz, F., Margolis, D.: High-wage workers and high-wage firms. Econometrica 67, 251–333 (1999)

  • Albert, J.: Bayesian Computation Using Minitab. Wadsworth Publishing Company (1996)

  • Albert, J.H.: Bayesian Computation with R. Springer, New York (2007)

  • Andrieu, C., Robert, C.P.: Controlled Markov chain Monte Carlo methods for optimal sampling. Technical Report 0125, Université Paris Dauphine (2001)

  • Andrieu, C., Doucet, A., Robert, C.P.: Computational advances for and from Bayesian analysis. Stat. Sci. 19(1), 118–127 (2004)

  • Bauwens, L., Richard, J.F.: A 1-1 Poly-t random variable generator with application to Monte Carlo integration. J. Econometrics 29, 19–46 (1985)

  • Beaumont, M.A., Zhang, W., Balding, D.J.: Approximate Bayesian computation in population genetics. Genetics 162, 2025–2035 (2002)

  • Beaumont, M.A., Cornuet, J.-M., Marin, J.-M., Robert, C.P.: Adaptive approximate Bayesian computation. Biometrika 96(4), 983–990 (2009)

  • Berkhof, J., van Mechelen, I., Gelman, A.: A Bayesian approach to the selection and testing of mixture models. Statistica Sinica 13, 423–442 (2003)

  • Blum, M.G.B., François, O.: Non-linear regression models for approximate Bayesian computation. Stat. Comput. 20, 63–73 (2010)

  • Bortot, P., Coles, S.G., Sisson, S.A.: Inference for stereological extremes. J. Am. Stat. Assoc. 102, 84–92 (2007)

  • Cappé, O., Robert, C.P.: MCMC: Ten years and still running! J. Am. Stat. Assoc. 95(4), 1282–1286 (2000)

  • Cappé, O., Guillin, A., Marin, J.-M., Robert, C.P.: Population Monte Carlo. J. Comput. Graph. Stat. 13(4), 907–929 (2004)

  • Cappé, O., Moulines, E., Rydén, T.: Inference in Hidden Markov Models. Springer, New York (2005)

  • Cappé, O., Douc, R., Guillin, A., Marin, J.-M., Robert, C.P.: Adaptive importance sampling in general mixture classes. Stat. Comput. 18, 447–459 (2008)

  • Carlin, B.P., Chib, S.: Bayesian model choice through Markov chain Monte Carlo. J. Roy. Stat. Soc. B 57(3), 473–484 (1995)

  • Casella, G., Robert, C.P.: Rao–Blackwellisation of sampling schemes. Biometrika 83(1), 81–94 (1996)

  • Celeux, G., Hurn, M.A., Robert, C.P.: Computational and inferential difficulties with mixture posterior distributions. J. Am. Stat. Assoc. 95(3), 957–979 (2000)

  • Chen, M.H., Shao, Q.M., Ibrahim, J.G.: Monte Carlo Methods in Bayesian Computation. Springer, New York (2000)

  • Chib, S.: Marginal likelihood from the Gibbs output. J. Am. Stat. Assoc. 90, 1313–1321 (1995)

  • Chopin, N.: Inference and model choice for time-ordered hidden Markov models. J. Roy. Stat. Soc. B 69(2), 269–284 (2007)

  • Chopin, N., Robert, C.P.: Properties of nested sampling. Biometrika 97, 741–755 (2010); see also arXiv:0801.3887

  • Cornuet, J.-M., Marin, J.-M., Mira, A., Robert, C.P.: Adaptive multiple importance sampling. Technical Report arXiv:0907.1254, CEREMADE, Université Paris Dauphine (2009)

  • Del Moral, P., Doucet, A., Jasra, A.: Sequential Monte Carlo samplers. J. Roy. Stat. Soc. B 68(3), 411–436 (2006)

  • Dickey, J.M.: The weighted likelihood ratio, linear hypotheses on normal location parameters. Ann. Math. Stat. 42, 204–223 (1971)

  • Diebolt, J., Robert, C.P.: Estimation of finite mixture distributions by Bayesian sampling. J. Roy. Stat. Soc. B 56, 363–375 (1994)

  • Doornik, J.A., Hendry, D.F., Shephard, N.: Computationally-intensive econometrics using a distributed matrix-programming language. Phil. Trans. Roy. Soc. London 360, 1245–1266 (2002)

  • Douc, R., Guillin, A., Marin, J.-M., Robert, C.P.: Convergence of adaptive mixtures of importance sampling schemes. Ann. Stat. 35(1), 420–448 (2007a)

  • Douc, R., Guillin, A., Marin, J.-M., Robert, C.P.: Minimum variance importance sampling via population Monte Carlo. ESAIM: Probab. Stat. 11, 427–447 (2007b)

  • Doucet, A., de Freitas, N., Gordon, N.: Sequential Monte Carlo Methods in Practice. Springer, New York (2001)

  • Frühwirth-Schnatter, S.: Markov chain Monte Carlo estimation of classical and dynamic switching and mixture models. J. Am. Stat. Assoc. 96(453), 194–209 (2001)

  • Frühwirth-Schnatter, S.: Estimating marginal likelihoods for mixture and Markov switching models using bridge sampling techniques. Econometrics J. 7(1), 143–167 (2004)

  • Frühwirth-Schnatter, S.: Finite Mixture and Markov Switching Models. Springer, New York (2006)

  • Gelfand, A.E., Smith, A.F.M.: Sampling based approaches to calculating marginal densities. J. Am. Stat. Assoc. 85, 398–409 (1990)

  • Gelman, A., Gilks, W.R., Roberts, G.O.: Efficient Metropolis jumping rules. In: Berger, J.O., Bernardo, J.M., Dawid, A.P., Lindley, D.V., Smith, A.F.M. (eds.) Bayesian Statistics 5, pp. 599–608. Oxford University Press, Oxford (1996)

  • Geweke, J.: Using simulation methods for Bayesian econometric models: Inference, development, and communication (with discussion and rejoinder). Economet. Rev. 18, 1–126 (1999)

  • Geweke, J.: Interpretation and inference in mixture models: Simple MCMC works. Comput. Stat. Data Anal. 51(7), 3529–3550 (2007)

  • Gilks, W.R., Berzuini, C.: Following a moving target: Monte Carlo inference for dynamic Bayesian models. J. Roy. Stat. Soc. B 63(1), 127–146 (2001)

  • Gilks, W.R., Thomas, A., Spiegelhalter, D.J.: A language and program for complex Bayesian modelling. The Statistician 43, 169–178 (1994)

  • Gilks, W.R., Roberts, G.O., Sahu, S.K.: Adaptive Markov chain Monte Carlo. J. Am. Stat. Assoc. 93, 1045–1054 (1998)

  • Gordon, N., Salmond, D.J., Smith, A.F.M.: A novel approach to non-linear/non-Gaussian Bayesian state estimation. IEE Proceedings F: Radar and Signal Processing 140, 107–113 (1993)

  • Green, P.J.: Reversible jump MCMC computation and Bayesian model determination. Biometrika 82(4), 711–732 (1995)

  • Haario, H., Saksman, E., Tamminen, J.: Adaptive proposal distribution for random walk Metropolis algorithm. Comput. Stat. 14(3), 375–395 (1999)

  • Haario, H., Saksman, E., Tamminen, J.: An adaptive Metropolis algorithm. Bernoulli 7(2), 223–242 (2001)

  • Hesterberg, T.: Weighted average importance sampling and defensive mixture distributions. Technometrics 37, 185–194 (1995)

  • Iba, Y.: Population-based Monte Carlo algorithms. Trans. Jpn. Soc. Artif. Intell. 16(2), 279–286 (2000)

  • Jasra, A., Holmes, C.C., Stephens, D.A.: Markov chain Monte Carlo methods and the label switching problem in Bayesian mixture modeling. Stat. Sci. 20(1), 50–67 (2005)

  • Jeffreys, H.: Theory of Probability. Oxford Classic Texts in the Physical Sciences (3rd edn.), Oxford University Press, Oxford (1961)

  • Lee, K., Marin, J.-M., Mengersen, K.L., Robert, C.P.: Bayesian inference on mixtures of distributions. In: Narasimha Sastry, N.S., Delampady, M., Rajeev, B. (eds.) Perspectives in Mathematical Sciences I: Probability and Statistics, pp. 165–202. World Scientific, Singapore (2009)

  • Liu, J.S.: Monte Carlo Strategies in Scientific Computing. Springer, New York (2001)

  • Liu, J.S., Wong, W.H., Kong, A.: Covariance structure of the Gibbs sampler with applications to the comparisons of estimators and sampling schemes. Biometrika 81, 27–40 (1994)

  • Marin, J.-M., Robert, C.P.: Bayesian Core. Springer, New York (2007a)

  • Marin, J.-M., Robert, C.P.: Importance sampling methods for Bayesian discrimination between embedded models. In: Chen, M.-H., Dey, D.K., Müller, P., Sun, D., Ye, K. (eds.) Frontiers of Statistical Decision Making and Bayesian Analysis. Springer, New York (2007b); see also arXiv:0910.2325

  • Marjoram, P., Molitor, J., Plagnol, V., Tavaré, S.: Markov chain Monte Carlo without likelihoods. Proc. Natl. Acad. Sci. USA 100(26), 15324–15328 (2003)

  • McCullagh, P., Nelder, J.: Generalized Linear Models. Chapman and Hall, New York (1989)

  • Meng, X.L., Wong, W.H.: Simulating ratios of normalizing constants via a simple identity: a theoretical exploration. Stat. Sinica 6, 831–860 (1996)

  • Metropolis, N., Ulam, S.: The Monte Carlo method. J. Am. Stat. Assoc. 44, 335–341 (1949)

  • Neal, R.M.: Slice sampling (with discussion). Ann. Stat. 31, 705–767 (2003)

  • Nobile, A.: A hybrid Markov chain for the Bayesian analysis of the multinomial probit model. Stat. Comput. 8, 229–242 (1998)

  • Pole, A., West, M., Harrison, P.J.: Applied Bayesian Forecasting and Time Series Analysis. Chapman and Hall, New York (1994)

  • Pritchard, J.K., Seielstad, M.T., Perez-Lezaun, A., Feldman, M.W.: Population growth of human Y chromosomes: a study of Y chromosome microsatellites. Mol. Biol. Evol. 16, 1791–1798 (1999)

  • Richardson, S., Green, P.J.: On Bayesian analysis of mixtures with an unknown number of components (with discussion). J. Roy. Stat. Soc. B 59, 731–792 (1997)

  • Robert, C.P.: The Bayesian Choice. Paperback edn, Springer, New York (2007)

  • Robert, C.P., Casella, G.: Monte Carlo Statistical Methods (2nd edn.), Springer, New York (2004)

  • Robert, C.P., Casella, G.: Introducing Monte Carlo Methods with R. Springer, New York (2009)

  • Robert, C.P., Casella, G.: A history of Markov chain Monte Carlo: subjective recollections from incomplete data. In: Brooks, S., Gelman, A., Meng, X.L., Jones, G. (eds.) Handbook of Markov Chain Monte Carlo: Methods and Applications. Chapman and Hall, New York (2010); see also arXiv:0808.2902

  • Robert, C.P., Marin, J.-M.: On resolving the Savage–Dickey paradox. Technical Report arXiv:0910.1452, CEREMADE, Université Paris Dauphine (2009)

  • Robert, C.P., Wraith, D.: Computational methods for Bayesian model choice. In: Paul, M.G., Chun-Yong, C. (eds.) MaxEnt 2009 Proceedings, vol. 1193, AIP (2009)

  • Roberts, G.O., Rosenthal, J.S.: Examples of adaptive MCMC. J. Comput. Graph. Stat. 18, 349–367 (2009)

  • Roeder, K.: Density estimation with confidence sets exemplified by superclusters and voids in galaxies. J. Am. Stat. Assoc. 85, 617–624 (1992)

  • Rosenthal, J.S.: AMCMC: An R interface for adaptive MCMC. Comput. Stat. Data Anal. 51, 5467–5470 (2007)

  • Shephard, N., Pitt, M.K.: Likelihood analysis of non-Gaussian measurement time series. Biometrika 84, 653–668 (1997)

  • Skilling, J.: Nested sampling for general Bayesian computation. Bayesian Anal. 1(4), 833–860 (2006)

  • Spiegelhalter, D.J., Thomas, A., Best, N.G.: WinBUGS Version 1.2 User Manual. Cambridge (1999)

  • Stavropoulos, P., Titterington, D.M.: Improved particle filters and smoothing. In: Doucet, A., de Freitas, N., Gordon, N. (eds.) Sequential MCMC in Practice. Springer, New York (2001)

  • Tanner, M., Wong, W.: The calculation of posterior distributions by data augmentation. J. Am. Stat. Assoc. 82, 528–550 (1987)

  • Verdinelli, I., Wasserman, L.: Computing Bayes factors using a generalization of the Savage–Dickey density ratio. J. Am. Stat. Assoc. 90, 614–618 (1995)

  • Wraith, D., Kilbinger, M., Benabed, K., Cappé, O., Cardoso, J.-F., Fort, G., Prunet, S., Robert, C.P.: Estimation of cosmological parameters using adaptive importance sampling. Phys. Rev. D 80, 023507 (2009)


Acknowledgements

This work was partly supported by the Agence Nationale de la Recherche (ANR, 212 rue de Bercy, 75012 Paris) through the 2009–2012 projects Big’MC and EMILE.

Author information


Correspondence to Christian P. Robert.



Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Robert, C.P. (2012). Bayesian Computational Methods. In: Gentle, J., Härdle, W., Mori, Y. (eds) Handbook of Computational Statistics. Springer Handbooks of Computational Statistics. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21551-3_26
