Sankhya B, Volume 80, Issue 1, pp 60–84

A Variant of AIC Based on the Bayesian Marginal Likelihood

  • Yuki Kawakubo
  • Tatsuya Kubokawa
  • Muni S. Srivastava


Abstract

We propose information criteria that measure the prediction risk of a predictive density based on the Bayesian marginal likelihood from a frequentist point of view. We derive criteria for selecting variables in linear regression models, assuming a prior distribution for the regression coefficients, and then discuss the relationship between the proposed criteria and related criteria. Our method has three advantages. First, it is a compromise between the frequentist and Bayesian standpoints, because it evaluates the frequentist risk of the Bayesian model; it is therefore less influenced by prior misspecification. Second, the criteria exhibit consistency in selecting the true model. Third, when a uniform prior is assumed for the regression coefficients, the resulting criterion is equivalent to the residual information criterion (RIC) of Shi and Tsai (J. R. Stat. Soc. Ser. B 64, 237–252, 2002).
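The paper's own criteria are derived in the full text; as a rough illustration of the marginal-likelihood ingredient they build on, the sketch below scores candidate variable subsets by the closed-form log marginal likelihood of a normal linear model under Zellner's g-prior (one standard choice of prior for the regression coefficients; see reference 28). The function name and the unit-information default g = n are illustrative assumptions, not the paper's method.

```python
import numpy as np
from math import lgamma, pi


def log_marginal_g_prior(y, X, g=None):
    """Log marginal likelihood of y under the normal linear model
    y | beta, sigma^2 ~ N(X beta, sigma^2 I) with Zellner's g-prior
    beta | sigma^2 ~ N(0, g sigma^2 (X'X)^{-1}) and p(sigma^2) ∝ 1/sigma^2.

    Integrating out beta and sigma^2 gives the closed form
    log m(y) = log Gamma(n/2) - (n/2) log pi - (p/2) log(1+g) - (n/2) log Q,
    where Q = y'y - g/(1+g) * y'P y and P is the hat matrix of X.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    if g is None:
        g = float(n)  # unit-information choice; an assumption for illustration
    # y'P y = squared norm of the fitted values from least squares
    fitted = X @ np.linalg.solve(X.T @ X, X.T @ y)
    Q = y @ y - g / (1.0 + g) * (y @ fitted)
    return (lgamma(n / 2.0) - (n / 2.0) * np.log(pi)
            - (p / 2.0) * np.log(1.0 + g) - (n / 2.0) * np.log(Q))
```

Given a list of candidate design matrices, one would select the subset maximizing this quantity; the criteria in the paper instead evaluate the frequentist prediction risk of the predictive density built from such a marginal likelihood.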

Keywords and phrases

AIC; BIC; consistency; Kullback–Leibler divergence; linear regression model; residual information criterion; variable selection

AMS (2000) subject classification.

Primary 62J05; Secondary 62F12





Acknowledgements

The authors are grateful to the associate editor and the anonymous referee for their valuable comments and helpful suggestions. The first and second authors were supported, in part, by Grants-in-Aid for Scientific Research from the Japan Society for the Promotion of Science (JSPS). The third author was supported, in part, by NSERC of Canada.


References

  1. Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In 2nd International Symposium on Information Theory, B.N. Petrov and F. Csaki (eds.). Akademiai Kiado, Budapest, pp. 267–281.
  2. Akaike, H. (1974). A new look at the statistical model identification. IEEE Trans. Autom. Control AC-19, 716–723.
  3. Akaike, H. (1980a). On the use of predictive likelihood of a Gaussian model. Ann. Inst. Stat. Math. 32, 311–324.
  4. Akaike, H. (1980b). Likelihood and the Bayes procedure. In Bayesian Statistics, J.M. Bernardo, M.H. DeGroot, D.V. Lindley and A.F.M. Smith (eds.). University Press, Valencia, pp. 141–166.
  5. Ando, T. (2007). Bayesian predictive information criterion for the evaluation of hierarchical Bayesian and empirical Bayes models. Biometrika 94, 443–458.
  6. Battese, G.E., Harter, R.M. and Fuller, W.A. (1988). An error-components model for prediction of county crop areas using survey and satellite data. J. Am. Stat. Assoc. 83, 28–36.
  7. Bozdogan, H. (1987). Model selection and Akaike’s information criterion (AIC): The general theory and its analytical extensions. Psychometrika 52, 345–370.
  8. Dellaportas, P., Forster, J.J. and Ntzoufras, I. (1997). On Bayesian model and variable selection using MCMC. Technical Report, Department of Statistics, Athens University of Economics and Business, Athens.
  9. George, E.I. and McCulloch, R.E. (1993). Variable selection via Gibbs sampling. J. Am. Stat. Assoc. 88, 881–889.
  10. George, E.I. and McCulloch, R.E. (1997). Approaches for Bayesian variable selection. Stat. Sin. 7, 339–373.
  11. Henderson, C.R. (1950). Estimation of genetic parameters. Ann. Math. Stat. 21, 309–310.
  12. Hurvich, C.M. and Tsai, C.-L. (1989). Regression and time series model selection in small samples. Biometrika 76, 297–307.
  13. Kitagawa, G. (1997). Information criteria for the predictive evaluation of Bayesian models. Commun. Stat. Theory Methods 26, 2223–2246.
  14. Kuo, L. and Mallick, B. (1998). Variable selection for regression models. Sankhya Ser. B 60, 65–81.
  15. Nishii, R. (1984). Asymptotic properties of criteria for selection of variables in multiple regression. Ann. Stat. 12, 758–765.
  16. O’Hara, R.B. and Sillanpää, M.J. (2009). A review of Bayesian variable selection methods: what, how and which. Bayesian Anal. 4, 85–118.
  17. Patterson, H.D. and Thompson, R. (1971). Recovery of inter-block information when block sizes are unequal. Biometrika 58, 545–554.
  18. Schwarz, G. (1978). Estimating the dimension of a model. Ann. Stat. 6, 461–464.
  19. Shao, J. (1997). An asymptotic theory for linear model selection. Stat. Sin. 7, 221–264.
  20. Shi, P. and Tsai, C.-L. (2002). Regression model selection—a residual likelihood approach. J. R. Stat. Soc. Ser. B 64, 237–252.
  21. Shibata, R. (1981). An optimal selection of regression variables. Biometrika 68, 45–54.
  22. Spiegelhalter, D.J., Best, N.G., Carlin, B.P. and van der Linde, A. (2002). Bayesian measures of model complexity and fit. J. R. Stat. Soc. Ser. B 64, 583–639.
  23. Srivastava, M.S. and Kubokawa, T. (2010). Conditional information criteria for selecting variables in linear mixed models. J. Multivar. Anal. 101, 1970–1980.
  24. Sugiura, N. (1978). Further analysis of the data by Akaike’s information criterion and the finite corrections. Commun. Stat. Theory Methods 7, 13–26.
  25. Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B 58, 267–288.
  26. Vaida, F. and Blanchard, S. (2005). Conditional Akaike information for mixed-effects models. Biometrika 92, 351–370.
  27. Xu, X. and Ghosh, M. (2015). Bayesian variable selection and estimation for group lasso. Bayesian Anal. 10, 909–936.
  28. Zellner, A. (1986). On assessing prior distributions and Bayesian regression analysis with g-prior distributions. In Bayesian Inference and Decision Techniques: Essays in Honor of Bruno de Finetti, P.K. Goel and A. Zellner (eds.). North-Holland/Elsevier, Amsterdam, pp. 233–243.

Copyright information

© Indian Statistical Institute 2018

Authors and Affiliations

  • Yuki Kawakubo (1)
  • Tatsuya Kubokawa (2)
  • Muni S. Srivastava (3)

  1. Graduate School of Social Sciences, Chiba University, Chiba, Japan
  2. Faculty of Economics, University of Tokyo, Tokyo, Japan
  3. Department of Statistics, University of Toronto, Toronto, Canada
