Abstract
We consider Bayesian variable selection in normal linear regression models based on Zellner’s \(g\)-prior. We study the theoretical properties of this method as the sample size \(n\) grows, both when the number of regressors \(p\) is fixed and when it grows with \(n\). We first consider the situation where the true model is not in the model space and prove, under mild conditions, that the method is consistent and “loss efficient” in an appropriate sense. We then consider the case where the true model is in the model space and prove that the posterior probability of the true model goes to one as \(n\) goes to infinity. “Loss efficiency” is also proved in this situation. We give explicit conditions on the rate of growth of \(g\), possibly depending on that of \(p\), as \(n\) grows, for our results to hold. These conditions help in making recommendations for the choice of \(g\).
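To make the object of study concrete, the following is a minimal sketch of posterior model probabilities under a fixed-\(g\) Zellner prior. It is not the paper's own setup: it assumes a flat prior on a common intercept, the reference prior on the error variance, a uniform prior over all \(2^p\) models, and the choice \(g = n\). Under those assumptions the Bayes factor of a model with \(k\) regressors and coefficient of determination \(R^2\) against the intercept-only model has the well-known closed form \((1+g)^{(n-1-k)/2} / [1 + g(1-R^2)]^{(n-1)/2}\). The function names and the simulated data are hypothetical and serve only as illustration.

```python
import itertools
import numpy as np


def log_bayes_factor(y, X, subset, g):
    """Log Bayes factor of the model using columns `subset` vs. the null model.

    Assumes the fixed-g closed form with a flat intercept prior and the
    reference prior on the error variance (an assumption of this sketch).
    """
    n = len(y)
    cols = list(subset)
    k = len(cols)
    if k == 0:
        return 0.0  # the null model compared with itself
    yc = y - y.mean()
    Xc = X[:, cols] - X[:, cols].mean(axis=0)
    coef, *_ = np.linalg.lstsq(Xc, yc, rcond=None)
    resid = yc - Xc @ coef
    r2 = 1.0 - (resid @ resid) / (yc @ yc)
    return 0.5 * (n - 1 - k) * np.log1p(g) - 0.5 * (n - 1) * np.log1p(g * (1.0 - r2))


def posterior_model_probs(y, X, g):
    """Enumerate all subsets of columns and normalise under a uniform model prior."""
    p = X.shape[1]
    models = [s for k in range(p + 1) for s in itertools.combinations(range(p), k)]
    log_bf = np.array([log_bayes_factor(y, X, s, g) for s in models])
    weights = np.exp(log_bf - log_bf.max())  # subtract the max for numerical stability
    return models, weights / weights.sum()


# Simulated data (hypothetical): only regressors 0 and 2 enter the true model.
rng = np.random.default_rng(0)
n, p = 200, 4
X = rng.normal(size=(n, p))
y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 2] + rng.normal(size=n)

models, probs = posterior_model_probs(y, X, g=n)  # g = n is an assumed choice
best = models[int(np.argmax(probs))]
print("highest-probability model:", best)
```

Full enumeration is feasible only for small \(p\). Rerunning the sketch with increasing \(n\) and different growth rates for \(g\) gives an informal feel for the kind of consistency statement the abstract describes, though it proves nothing.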
Acknowledgments
We are thankful to the associate editor and the referees for their very useful comments and suggestions that helped us improve the paper.
About this article
Cite this article
Mukhopadhyay, M., Samanta, T. & Chakrabarti, A. On consistency and optimality of Bayesian variable selection based on \(g\)-prior in normal linear regression models. Ann Inst Stat Math 67, 963–997 (2015). https://doi.org/10.1007/s10463-014-0483-8
DOI: https://doi.org/10.1007/s10463-014-0483-8