Orthogonalizations and Prior Distributions for Orthogonalized Model Mixing

  • Merlise Clyde
  • Giovanni Parmigiani


Prediction methods based on mixing over a set of plausible models can help alleviate the sensitivity of inference and decisions to modeling assumptions. One important application area is prediction in linear models. Computing techniques for model mixing in linear models include Markov chain Monte Carlo methods as well as importance sampling. Clyde, DeSimone and Parmigiani (1996) developed an importance sampling strategy based on expressing the space of predictors in terms of an orthogonal basis. This leads both to a better identified problem and to simple approximations to the posterior model probabilities. Such approximations can be used to construct efficient importance samplers. For brevity, we call this strategy orthogonalized model mixing.

Two key elements of orthogonalized model mixing are: a) the orthogonalization method and b) the prior probability distributions assigned to the models and the coefficients. In this paper we consider in further detail the specification of these two elements. In particular, after identifying the aspects of these specifications that are essential to the success of the importance sampler, we list and briefly discuss a number of different alternatives for both a) and b). We highlight the features that may make each one of the options attractive in specific situations and we illustrate some important points via a simulated data set.
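The role of the orthogonal basis can be illustrated with a minimal sketch. It is a toy setting, not the authors' procedure: the known error variance `sigma2`, the independent N(0, `tau2`) priors on the coefficients of the orthogonal components, the common prior inclusion probability `prior`, the SVD-based orthogonalization, and all function names are assumptions introduced here. Under these assumptions the posterior over models factorizes exactly into a product of per-component inclusion probabilities; product forms of this kind are what an importance sampler over model space could use as a proposal when the factorization is only approximate.

```python
import numpy as np


def orthogonalize(X):
    """Orthonormal basis U for the centered predictors (one possible choice: SVD)."""
    Xc = X - X.mean(axis=0)
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    return U


def inclusion_probs(U, y, sigma2, tau2, prior=0.5):
    """Posterior inclusion probability of each orthogonal component.

    Assumes y ~ N(U_gamma beta, sigma2 * I) with sigma2 known and
    beta_j ~ N(0, tau2) independently for included components.
    """
    z = U.T @ (y - y.mean())  # projections of the response on the basis
    # Log Bayes factor for including component j versus excluding it.
    log_bf = (-0.5 * np.log1p(tau2 / sigma2)
              + 0.5 * z**2 * tau2 / (sigma2 * (sigma2 + tau2)))
    log_odds = np.log(prior / (1.0 - prior)) + log_bf
    return 1.0 / (1.0 + np.exp(-log_odds))


def model_prob(gamma, p_incl):
    """Posterior probability of the model with inclusion indicators gamma."""
    gamma = np.asarray(gamma, dtype=bool)
    return float(np.prod(np.where(gamma, p_incl, 1.0 - p_incl)))


# Simulated data with two nearly collinear predictors (hypothetical example).
rng = np.random.default_rng(0)
n, p = 50, 4
X = rng.normal(size=(n, p))
X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=n)
y = 1.0 + X[:, 0] - 0.5 * X[:, 2] + rng.normal(size=n)

U = orthogonalize(X)
p_incl = inclusion_probs(U, y, sigma2=1.0, tau2=10.0)
print("inclusion probabilities:", np.round(p_incl, 3))
print("P(all components included):", model_prob([1, 1, 1, 1], p_incl))
```

The inclusion probabilities refer to the orthogonal components, not to the original predictors, so the choice of orthogonalization determines what each "variable" in the model space means. A different basis (for example, one built with reference to the response) would change both the components and their probabilities.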


Partial Least Squares; Prior Distribution; Model Space; Importance Sampling; Principal Component Regression
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




  1. Bernardo, J.M. and Smith, A.F.M. (1994). Bayesian Theory. Wiley, New York.
  2. Chipman, H. (1996). Bayesian Variable Selection with Related Predictors. Canadian Journal of Statistics, to appear.
  3. Clyde, M.A., DeSimone, H., and Parmigiani, G. (1996). Prediction via Orthogonalized Model Mixing. Journal of the American Statistical Association, forthcoming.
  4. Draper, D. (1995). Assessment and propagation of model uncertainty (with discussion). Journal of the Royal Statistical Society 57, pp. 45–98.
  5. Foster, D., George, E., and McCulloch, R. (1995). Calibrating Bayesian variable selection procedures.
  6. Garthwaite, P.H. and Dickey, J.M. (1992). Elicitation of prior distributions for variable selection problems in regression. Annals of Statistics 20, pp. 1697–1719.
  7. Geisser, S. (1993). Predictive Inference: An Introduction. Chapman & Hall, New York.
  8. George, E.I. (1986). Minimax multiple shrinkage estimation. Annals of Statistics 14, pp. 188–205.
  9. George, E.I. (1986). Combining minimax shrinkage estimators. Journal of the American Statistical Association 81, pp. 437–445.
  10. George, E.I. and McCulloch, R. (1993). Variable Selection via Gibbs Sampling. Journal of the American Statistical Association 88, pp. 881–889.
  11. George, E.I. and McCulloch, R. (1994). Fast Bayes Variable Selection. TR, Graduate School of Business, University of Chicago.
  12. George, E.I. and Oman, S.D. (1993). Multiple shrinkage principal component regression. TR, University of Texas at Austin.
  13. Geweke, J.F. (1994). Bayesian comparison of econometric models. Working Paper 532, Federal Reserve Bank of Minneapolis.
  14. Green, P.J. (1995). Reversible Jump MCMC Computation and Bayesian Model Determination. TR-94-19, University of Bristol.
  15. Hoeting, J., Raftery, A.E., and Madigan, D.M. (1995). A Method for Simultaneous Variable Selection and Outlier Identification in Linear Regression. Technical Report 95-02, Colorado State University.
  16. Jolliffe, I.T. (1986). Principal Component Analysis. Springer, New York.
  17. Jolliffe, I.T. (1982). A note on the use of principal components in regression. Applied Statistics 31, pp. 300–303.
  18. Kadane, J.B., Dickey, J.M., Winkler, R.L., Smith, W.S., and Peters, S.C. (1980). Interactive elicitation of opinion for a normal linear model. Journal of the American Statistical Association 75, pp. 845–85.
  19. Laud, P. and Ibrahim, J.G. (1994). Predictive Specification of Prior Model Probability in Variable Selection. TR, Division of Statistics, University of Northern Illinois.
  20. Li, K-C. (1991). Sliced inverse regression for dimension reduction. Journal of the American Statistical Association 86, pp. 316–327.
  21. Mitchell, T.J. and Beauchamp, J.J. (1988). Bayesian Variable Selection in Linear Regression. Journal of the American Statistical Association 83, pp. 1023–1036.
  22. Palm, F.C. and Zellner, A. (1992). To Combine or not to Combine? Issues of Combining Forecasts. Journal of Forecasting 11, pp. 687–701.
  23. Phillips, D.B. and Smith, A.F.M. (1994). Bayesian Model Comparison via Jump Diffusions. TR-94-20, Department of Mathematics, Imperial College, UK.
  24. Raftery, A.E., Madigan, D.M., and Hoeting, J. (1993). Model selection and accounting for model uncertainty in linear regression models. TR 262, Department of Statistics, University of Washington.
  25. Raftery, A.E., Madigan, D.M., and Volinsky, C.T. (1995). Accounting for Model Uncertainty in Survival Analysis Improves Predictive Performance (with discussion). In Bayesian Statistics 5, ed. J.M. Bernardo, J.O. Berger, A.P. Dawid, and A.F.M. Smith.
  26. Rao, C.R. (1964). The Use and Interpretation of Principal Components in Applied Research. Sankhya A 26, pp. 329–358.
  27. Stone, M. and Brooks, R.J. (1990). Continuum regression: Cross-validated sequentially constructed prediction embracing ordinary least squares, partial least squares and principal components regression. Journal of the Royal Statistical Society, Series B 52, pp. 237–269.
  28. Weisberg, S. (1985). Applied Linear Regression. 2nd Edition. Wiley, New York.
  29. Wold, S., Ruhe, A., Wold, H., and Dunn, W.J. III (1984). The collinearity problem in linear regression. The partial least squares (PLS) approach to generalized inverses. SIAM Journal on Scientific and Statistical Computing 5, pp. 735–743.
  30. Zellner, A. (1986). On assessing prior distribution in Bayesian regression analysis with g-prior distributions. In Bayesian Inference and Decision Techniques: Essays in Honor of Bruno de Finetti (P.K. Goel and A. Zellner, eds.), pp. 233–243. North Holland, Amsterdam.

Copyright information

© Springer Science+Business Media New York 1996

Authors and Affiliations

  • Merlise Clyde (1)
  • Giovanni Parmigiani (1)
  1. Institute of Statistics and Decision Sciences, Duke University, Durham, USA
