Abstract
If, in the mid-1980s, one had asked the average statistician about the difficulties of using Bayesian statistics, the most likely answer would have been “Well, there is this problem of selecting a prior distribution and then, even if one agrees on the prior, the whole Bayesian inference is simply impossible to implement in practice!” The same question asked in the twenty-first century does not produce the same reply, but rather a much less aggressive complaint about the lack of generic software (besides WinBUGS), along with the renewed worry of subjectively selecting a prior! The last 20 years have indeed witnessed a tremendous change in the way Bayesian statistics is perceived, both by mathematical statisticians and by applied statisticians, and the impetus behind this change has been a prodigious leap forward in computational abilities. The availability of very powerful approximation methods has correspondingly freed Bayesian modelling, in terms of both model scope and prior modelling. This opening has induced many more scientists from outside the statistics community to opt for a Bayesian perspective, as they can now handle these tools on their own. As discussed below, a most successful illustration of this newfound freedom can be seen in Bayesian model choice, which was only emerging at the beginning of the MCMC era, for lack of appropriate computational tools.
Notes
- 1.
In this chapter, the denomination universal is used in the sense of uniformly over all distributions.
- 2.
To impose the stationarity constraint when the order of the AR(p) model varies, it is necessary to reparameterise this model in terms of either the partial autocorrelations or the roots of the associated lag polynomial. (See, e.g., Robert 2007, Sect. 4.5.)
- 3.
In this presentation of Bayes factors, we completely bypass the methodological difficulty of defining π(θ ∈ Θ 0) when Θ 0 is of measure 0 for the original prior π and refer the reader to Robert (2007, Sect. 5.2.3) and Marin and Robert (2007, Sect. 2.3.2) for proper coverage of this issue.
- 4.
The prior distribution can be used for importance sampling only if it is a proper prior and not a σ-finite measure.
- 5.
The constant order of the Monte Carlo error obviously does not imply that the computational effort remains the same as the dimension increases; it implies rather that the decrease (with m) in variation occurs at the rate \(1/\sqrt{m}\).
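The \(1/\sqrt{m}\) rate is easy to check by simulation. The following Python sketch (an illustration assuming a standard normal integrand, not an example taken from the chapter) replicates the Monte Carlo average for increasing m:

```python
import numpy as np

def mc_estimator_std(m, reps=200, seed=0):
    # Empirical standard deviation of the Monte Carlo average of m
    # standard-normal draws, computed over `reps` independent replications.
    rng = np.random.default_rng(seed)
    means = np.array([rng.standard_normal(m).mean() for _ in range(reps)])
    return means.std()

# Multiplying m by 100 divides the Monte Carlo error by about sqrt(100) = 10,
# whatever the cost of producing each individual draw.
for m in (100, 10_000, 1_000_000):
    print(m, mc_estimator_std(m))
```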
- 6.
The empirical (Monte Carlo) confidence interval is not to be confused with the asymptotic confidence interval derived from the normal approximation. As discussed in Robert and Casella (2004, Chap. 4), these two intervals may differ considerably in width, with the interval derived from the CLT being much more optimistic!
- 7.
An alternative to simulating from a single \(\mathcal{T} (\nu,{x}_{i}, 1)\) distribution, which does not require an extensive search for the most appropriate x i, is to use a mixture of the \(\mathcal{T} (\nu,{x}_{i}, 1)\) distributions. As seen in Sect. 26.5.2, the weights of this mixture can even be optimised automatically.
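As a minimal sketch of this idea in Python (the target, centres, weights and degrees of freedom below are illustrative assumptions, not values from the chapter), importance sampling with a mixture of Student-t proposals can be written as:

```python
import numpy as np
from scipy import stats

def mixture_t_is(log_target, centers, weights, nu=5, n=10_000, seed=0):
    # Self-normalised importance sampling estimate of E[X] under the target,
    # using the mixture sum_i w_i T(nu, x_i, 1) as importance function.
    rng = np.random.default_rng(seed)
    centers = np.asarray(centers, dtype=float)
    weights = np.asarray(weights, dtype=float)
    comp = rng.choice(len(centers), size=n, p=weights)  # pick a component
    x = centers[comp] + rng.standard_t(nu, size=n)      # draw from it
    # Importance weights: target density over the full mixture density.
    q = (weights * stats.t.pdf(x[:, None] - centers, df=nu)).sum(axis=1)
    w = np.exp(log_target(x)) / q
    return np.sum(w * x) / np.sum(w)

# Illustrative target: an (unnormalised) N(2, 1) log-density, whose mean is 2.
est = mixture_t_is(lambda x: -0.5 * (x - 2.0) ** 2,
                   centers=[0.0, 3.0], weights=[0.5, 0.5])
```

Since the estimate is self-normalised, the target density only needs to be known up to a constant, as is typical for posterior distributions.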
- 8.
We stress that this is mostly an academic exercise since, in regular settings, independent components are rarely used for the importance function.
- 9.
Sect. 26.4.3 covers the setting of varying-dimension problems in greater detail, with the same theme that completion distributions and parameters are necessary but influential for the performance of the approximation.
- 10.
Even in the simple case of the probit model, MCMC algorithms do not always converge very quickly, as shown in Robert and Casella (2004, Chap. 14).
- 11.
It is quite interesting to see that the mixture Gibbs sampler suffers from the same pathology as the EM algorithm, although this is not surprising given that it is based on the same completion scheme.
- 12.
This wealth of possible alternatives to the completion Gibbs sampler is a mixed blessing in that their tuning parameters, for instance the scale of the random walk proposals, need to be calibrated properly to avoid inefficiencies.
- 13.
The difficulty with the infinite part of the problem is easily solved in that the setting is identical to simulation problems in (countable or uncountable) infinite spaces. When running simulations in those spaces, some values are never visited by the simulated Markov chain, and the chance that a value is visited is related to its probability under the target distribution.
- 14.
Early proposals to solve the varying dimension problem involved saturation schemes where all the parameters for all models were updated deterministically (Carlin and Chib 1995), but they do not apply to an infinite collection of models and they need to be calibrated precisely to achieve a sufficient amount of moves between models.
- 15.
For a simple proof that the acceptance probability guarantees that the stationary distribution is π(k, θ(k)), see Robert and Casella (2004, Sect. 11.2.2).
- 16.
In the birth acceptance probability, the factorials k! and (k + 1)! appear as the numbers of ways of ordering the k and k + 1 components of the mixtures. The ratio cancels with \(1/(k + 1)\), which is the probability of selecting a particular component for the death step.
- 17.
The “sequential” denomination in the sequential Monte Carlo methods thus refers to the algorithmic part, not to the statistical part.
- 18.
- 19.
Using a Gaussian non-parametric kernel estimator amounts to (a) sampling from the x i (t)’s with equal weights and (b) using a normal random walk move from the selected x i (t), with standard deviation equal to the bandwidth of the kernel.
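In Python, such a kernel regeneration step can be sketched as follows (the particle cloud and bandwidth below are illustrative stand-ins; in practice the bandwidth would come from a standard kernel rule):

```python
import numpy as np

def gaussian_kernel_move(particles, bandwidth, rng):
    # Sampling from the Gaussian kernel density estimate of the cloud:
    # (a) resample the particles with equal weights, then
    # (b) add a normal random-walk move with std equal to the bandwidth.
    n = len(particles)
    idx = rng.integers(0, n, size=n)                            # step (a)
    return particles[idx] + bandwidth * rng.standard_normal(n)  # step (b)

rng = np.random.default_rng(1)
cloud = rng.standard_normal(1_000)   # stand-in for the x_i(t)'s
new_cloud = gaussian_kernel_move(cloud, bandwidth=0.2, rng=rng)
```

The two steps together are exactly one draw per particle from the kernel density estimate, which is why the move both diversifies the cloud and (mildly) inflates its variance by the squared bandwidth.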
- 20.
When the survival rate of a proposal distribution is zero, the corresponding number r k of proposals with that scale is set to a positive value, such as 1% of the sample size, in order to avoid the complete removal of the given scale v k.
- 21.
An R package called mcsm has been developed in association with Robert and Casella (2009) as a training tool for Monte Carlo methods.
References
Abowd, J., Kramarz, F., Margolis, D.: High-wage workers and high-wage firms. Econometrica 67, 251–333 (1999)
Albert, J.: Bayesian Computation Using Minitab. Wadsworth Publishing Company (1996)
Albert, J.H.: Bayesian Computation with R. Springer, New York, (2007)
Andrieu, C., Robert, C.P.: Controlled Markov chain Monte Carlo methods for optimal sampling. Technical Report 0125, Université Paris Dauphine (2001)
Andrieu, C., Doucet, A., Robert, C.P.: Computational advances for and from Bayesian analysis. Stat. Sci. 19(1), 118–127 (2004)
Bauwens, L., Richard, J.F.: A 1-1 Poly-t random variable generator with application to Monte Carlo integration. J. Econometrics 29, 19–46 (1985)
Beaumont, M.A., Zhang, W., Balding, D.J.: Approximate Bayesian computation in population genetics. Genetics 162, 2025–2035 (2002)
Beaumont, M.A., Cornuet, J.-M., Marin, J.-M., Robert, C.P.: Adaptive approximate Bayesian computation. Biometrika 96(4), 983–990 (2009)
Berkhof, J., van Mechelen, I., Gelman, A.: A Bayesian approach to the selection and testing of mixture models. Statistica Sinica 13, 423–442 (2003)
Blum, M.G.B., François, O.: Non-linear regression models for approximate Bayesian computation. Stat. Comput. 20, 63–73 (2010)
Bortot, P., Coles, S.G., Sisson, S.A.: Inference for stereological extremes. J. Am. Stat. Assoc. 102, 84–92 (2007)
Cappé, O., Robert, C.P.: MCMC: Ten years and still running! J. Am. Stat. Assoc. 95(4), 1282–1286 (2000)
Cappé, O., Guillin, A., Marin, J.-M., Robert, C.P.: Population Monte Carlo. J. Comput. Graph. Stat. 13(4), 907–929 (2004)
Cappé, O., Moulines, E., Rydén, T.: Inference in Hidden Markov Models. Springer, New York (2005)
Cappé, O., Douc, R., Guillin, A., Marin, J.-M., Robert, C.P.: Adaptive importance sampling in general mixture classes. Stat. Comput. 18, 447–459 (2008)
Carlin, B.P., Chib, S.: Bayesian model choice through Markov chain Monte Carlo. J. Roy. Stat. Soc. B. 57(3), 473–484 (1995)
Casella, G., Robert, C.P.: Rao-Blackwellisation of sampling schemes. Biometrika 83(1), 81–94 (1996)
Celeux, G., Hurn, M.A., Robert, C.P.: Computational and inferential difficulties with mixture posterior distributions. J. Am. Stat. Assoc. 95(3), 957–979 (2000)
Chen, M.H., Shao, Q.M., Ibrahim, J.G.: Monte Carlo Methods in Bayesian Computation. Springer, New York (2000)
Chib, S.: Marginal likelihood from the Gibbs output. J. Am. Stat. Assoc. 90, 1313–1321 (1995)
Chopin, N.: Inference and model choice for time-ordered hidden Markov models. J. Roy. Stat. Soc. B. 69(2), 269–284 (2007)
Chopin, N., Robert, C.P.: Properties of nested sampling. Biometrika 97, 741–755 (2010); see also arXiv:0801.3887
Cornuet, J.-M., Marin, J.-M., Mira, A., Robert, C.P.: Adaptive multiple importance sampling. Technical Report arXiv.org:0907.1254, CEREMADE, Université Paris, Dauphine (2009)
Del Moral, P., Doucet, A., Jasra, A.: Sequential Monte Carlo samplers. J. Roy. Stat. Soc. B. 68(3), 411–436 (2006)
Dickey, J.M.: The weighted likelihood ratio, linear hypotheses on normal location parameters. Ann. Math. Stat. 42, 204–223 (1971)
Diebolt, J., Robert, C.P.: Estimation of finite mixture distributions by Bayesian sampling. J. Roy. Stat. Soc. B. 56, 363–375 (1994)
Doornik, J.A., Hendry, D.F., Shephard, N.: Computationally-intensive econometrics using a distributed matrix-programming language. Phil. Trans. Roy. Soc. London 360, 1245–1266 (2002)
Douc, R., Guillin, A., Marin, J.-M., Robert, C.P.: Convergence of adaptive mixtures of importance sampling schemes. Ann. Stat. 35(1), 420–448 (2007a)
Douc, R., Guillin, A., Marin, J.-M., Robert, C.P.: Minimum variance importance sampling via population Monte Carlo. ESAIM: Probab. Stat. 11, 427–447 (2007b)
Doucet, A., de Freitas, N., Gordon, N.: Sequential Monte Carlo Methods in Practice. Springer, New York (2001)
Frühwirth-Schnatter, S.: Markov chain Monte Carlo estimation of classical and dynamic switching and mixture models. J. Am. Stat. Assoc. 96(453), 194–209 (2001)
Frühwirth-Schnatter, S.: Estimating marginal likelihoods for mixture and Markov switching models using bridge sampling techniques. Econometrics J. 7(1), 143–167 (2004)
Frühwirth-Schnatter, S.: Finite Mixture and Markov Switching Models. Springer, New York (2006)
Gelfand, A.E., Smith, A.F.M.: Sampling based approaches to calculating marginal densities. J. Am. Stat. Assoc. 85, 398–409 (1990)
Gelman, A., Gilks, W.R., Roberts, G.O.: Efficient Metropolis jumping rules. In: Berger, J.O., Bernardo, J.M., Dawid, A.P., Lindley, D.V., Smith, A.F.M. (eds.) Bayesian Statistics 5, pp. 599–608. Oxford University Press, Oxford (1996)
Geweke, J.: Using simulation methods for Bayesian econometric models: Inference, development, and communication (with discussion and rejoinder). Economet. Rev. 18, 1–126 (1999)
Geweke, J.: Interpretation and inference in mixture models: Simple MCMC works. Comput. Stat. Data Anal. 51(7), 3529–3550 (2007)
Gilks, W.R., Berzuini, C.: Following a moving target–Monte Carlo inference for dynamic Bayesian models. J. Roy. Stat. Soc. B. 63(1), 127–146 (2001)
Gilks, W.R., Thomas, A., Spiegelhalter, D.J.: A language and program for complex Bayesian modelling. The Statistician 43, 169–178 (1994)
Gilks, W.R., Roberts, G.O., Sahu, S.K.: Adaptive Markov chain Monte Carlo. J. Am. Stat. Assoc. 93, 1045–1054 (1998)
Gordon, N.J., Salmond, D.J., Smith, A.F.M.: Novel approach to nonlinear/non-Gaussian Bayesian state estimation. IEE Proceedings F: Radar and Signal Processing 140, 107–113 (1993)
Green, P.J.: Reversible jump MCMC computation and Bayesian model determination. Biometrika 82(4), 711–732 (1995)
Haario, H., Saksman, E., Tamminen, J.: Adaptive proposal distribution for random walk Metropolis algorithm. Comput. Stat. 14(3), 375–395 (1999)
Haario, H., Saksman, E., Tamminen, J.: An adaptive Metropolis algorithm. Bernoulli 7(2), 223–242 (2001)
Hesterberg, T.: Weighted average importance sampling and defensive mixture distributions. Technometrics 37, 185–194 (1995)
Iba, Y.: Population-based Monte Carlo algorithms. Trans. Jpn. Soc. Artif. Intell. 16(2), 279–286 (2000)
Jasra, A., Holmes, C.C., Stephens, D.A.: Markov Chain Monte Carlo methods and the label switching problem in Bayesian mixture modeling. Stat. Sci. 20(1), 50–67 (2005)
Jeffreys, H.: Theory of Probability. Oxford Classic Texts in the Physical Sciences. (3rd edn.), Oxford University Press, Oxford (1961)
Lee, K., Marin, J.-M., Mengersen, K.L., Robert, C.P.: Bayesian inference on mixtures of distributions. In: Narasimha Sastry, N.S., Delampady, M., Rajeev, B. (eds.) Perspectives in Mathematical Sciences I: Probability and Statistics, pp. 165–202. World Scientific, Singapore (2009)
Liu, J.S.: Monte Carlo Strategies in Scientific Computing. Springer, New York (2001)
Liu, J.S., Wong, W.H., Kong, A.: Covariance structure of the Gibbs sampler with applications to the comparisons of estimators and sampling schemes. Biometrika 81, 27–40 (1994)
Marin, J.-M., Robert, C.P.: Bayesian Core. Springer, New York (2007a).
Marin, J.-M., Robert, C.P.: Importance sampling methods for Bayesian discrimination between embedded models. In: Chen, M.-H., Dey, D.K., Müller, P., Sun, D., Ye, K. (eds.) Frontiers of Statistical Decision Making and Bayesian Analysis. Springer, New York (2007b); To appear, see arXiv:0910.2325.
Marjoram, P., Molitor, J., Plagnol, V., Tavaré, S.: Markov chain Monte Carlo without likelihoods. Proc. Natl. Acad. Sci. USA 100(26), 15324–15328 (2003)
McCullagh, P., Nelder, J.: Generalized Linear Models. Chapman and Hall, New York (1989)
Meng, X.L., Wong, W.H.: Simulating ratios of normalizing constants via a simple identity: a theoretical exploration. Stat. Sinica. 6, 831–860 (1996)
Metropolis, N., Ulam, S.: The Monte Carlo method. J. Am. Stat. Assoc. 44, 335–341 (1949)
Neal, R.M.: Slice sampling (with discussion). Ann. Stat. 31, 705–767 (2003)
Nobile, A.: A hybrid Markov chain for the Bayesian analysis of the multinomial probit model. Stat. Comput. 8, 229–242 (1998)
Pole, A., West, M., Harrison, P.J.: Applied Bayesian Forecasting and Time Series Analysis. Chapman-Hall, New York (1994)
Pritchard, J.K., Seielstad, M.T., Perez-Lezaun, A., Feldman, M.W.: Population growth of human Y chromosomes: a study of Y chromosome microsatellites. Mol. Biol. Evol. 16, 1791–1798 (1999)
Richardson, S., Green, P.J.: On Bayesian analysis of mixtures with an unknown number of components (with discussion). J. Roy. Stat. Soc. B. 59, 731–792 (1997)
Robert, C.P.: The Bayesian Choice. paperback edn, Springer, New York (2007)
Robert, C.P., Casella, G.: Monte Carlo Statistical Methods. (2nd edn.), Springer, New York (2004)
Robert, C.P., Casella, G.: Introducing Monte Carlo Methods with R. Springer, New York (2009)
Robert, C.P., Casella, G.: A history of Markov chain Monte Carlo: subjective recollections from incomplete data. In: Brooks, S., Gelman, A., Meng, X.L., Jones, G. (eds.) Handbook of Markov Chain Monte Carlo: Methods and Applications. Chapman and Hall, New York (2010); arXiv:0808.2902
Robert, C.P., Marin, J.-M.: On resolving the Savage–Dickey paradox. Technical Report arxiv.org:0910.1452, CEREMADE, Université Paris Dauphine (2009)
Robert, C.P., Wraith, D.: Computational methods for Bayesian model choice. In: Paul, M.G., Chun-Yong, C. (eds.) MaxEnt 2009 proceedings, vol. 1193, AIP (2009)
Roberts, G.O., Rosenthal, J.S.: Examples of adaptive MCMC. J. Comp. Graph. Stat. 18, 349–367 (2009)
Roeder, K.: Density estimation with confidence sets exemplified by superclusters and voids in galaxies. J. Am. Stat. Assoc. 85, 617–624 (1992)
Rosenthal, J.S.: AMCMC: An R interface for adaptive MCMC. Comput. Stat. Data Anal. 51, 5467–5470 (2007)
Shephard, N., Pitt, M.K.: Likelihood analysis of non-Gaussian measurement time series. Biometrika 84, 653–668 (1997)
Skilling, J.: Nested sampling for general Bayesian computation. Bayesian Anal. 1(4), 833–860 (2006)
Spiegelhalter, D.J., Thomas, A., Best, N.G.: WinBUGS Version 1.2 User Manual. Cambridge (1999)
Stavropoulos, P., Titterington, D.M.: Improved particle filters and smoothing. In: Doucet, A., deFreitas, N., Gordon, N. (eds.) Sequential MCMC in Practice. Springer, New York (2001)
Tanner, M., Wong, W.: The calculation of posterior distributions by data augmentation. J. Am. Stat. Assoc. 82, 528–550 (1987)
Verdinelli, I., Wasserman, L.: Computing Bayes factors using a generalization of the Savage–Dickey density ratio. J. Am. Stat. Assoc. 90, 614–618 (1995)
Wraith, D., Kilbinger, M., Benabed, K., Cappé, O., Cardoso, J.-F., Fort, G., Prunet, S., Robert, C.P.: Estimation of cosmological parameters using adaptive importance sampling. Phys. Rev. D. 80, 023507 (2009)
Acknowledgements
This work has been partly supported by the Agence Nationale de la Recherche (ANR, 212 rue de Bercy, 75012 Paris) through the 2009–2012 projects Big’MC and EMILE.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
Cite this chapter
Robert, C.P. (2012). Bayesian Computational Methods. In: Gentle, J., Härdle, W., Mori, Y. (eds) Handbook of Computational Statistics. Springer Handbooks of Computational Statistics. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21551-3_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21550-6
Online ISBN: 978-3-642-21551-3