Abstract
High correlation among predictors has long been a nuisance in regression analysis. The crux of the problem is that the linear regression model assumes each predictor has an independent effect on the response, one that can be encapsulated in the predictor’s regression coefficient. When predictors are highly correlated, the data contain little information about the independent effect of each predictor. As a result, the regression coefficients can have large standard errors and signs opposite to those expected from a priori subject-matter theory. We propose a Bayesian model that accounts for correlation among the predictors by simultaneously performing selection and clustering of the predictors. Our model combines a Dirichlet process prior and a variable selection prior for the regression coefficients. In our model, highly correlated predictors can be grouped together by setting their corresponding coefficients exactly equal to one another. Similarly, redundant predictors can be removed from the model through the variable selection component of our prior. We demonstrate the competitiveness of our method through simulation studies and analysis of real data.
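The effect of correlation on standard errors, and the gain from tying coefficients together, can be illustrated with a small simulation. The sketch below is not the Bayesian model described in the abstract: it uses ordinary least squares on simulated data, and the function names, sample size, correlation values, and true coefficients are all assumptions chosen for illustration. It compares coefficient standard errors for uncorrelated and highly correlated predictors, then refits the correlated case with the two coefficients constrained to be equal, which amounts to regressing on the sum of the two predictors.

```python
import numpy as np

def ols_standard_errors(X, y):
    """Return OLS coefficients and their standard errors (no intercept)."""
    n, p = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - p)          # unbiased residual variance
    se = np.sqrt(sigma2 * np.diag(xtx_inv))   # sqrt of diagonal of Var(beta_hat)
    return beta, se

def simulate(rho, n=200, seed=0):
    """Two zero-mean, unit-variance predictors with correlation rho (hypothetical setup)."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    X = rng.multivariate_normal(np.zeros(2), cov, size=n)
    y = X @ np.array([1.0, 1.0]) + rng.normal(size=n)  # both true coefficients equal 1
    return X, y

# Uncorrelated versus highly correlated predictors.
X_lo, y_lo = simulate(rho=0.0)
X_hi, y_hi = simulate(rho=0.99)
_, se_lo = ols_standard_errors(X_lo, y_lo)
_, se_hi = ols_standard_errors(X_hi, y_hi)

# Constrain the two coefficients to be equal: regress on x1 + x2.
Z = X_hi.sum(axis=1, keepdims=True)
_, se_tied = ols_standard_errors(Z, y_hi)

print("SE, rho = 0.00:", se_lo)
print("SE, rho = 0.99:", se_hi)
print("SE, tied coefficient:", se_tied)
```

With a correlation of 0.99 the variance inflation factor is roughly 1 / (1 - 0.99²) ≈ 50, so the standard errors in the correlated fit should come out several times larger than in the uncorrelated fit, while the tied coefficient is estimated much more precisely. The equality constraint is imposed by hand here; in the proposed model the grouping is instead learned through the prior.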
Cite this article
McKay Curtis, S., Ghosh, S.K. A Bayesian Approach to Multicollinearity and the Simultaneous Selection and Clustering of Predictors in Linear Regression. J Stat Theory Pract 5, 715–735 (2011). https://doi.org/10.1080/15598608.2011.10483741
DOI: https://doi.org/10.1080/15598608.2011.10483741